Abstract

The generalized extreme value distribution and its particular case, the Gumbel extreme 
value distribution, are widely applied for extreme value analysis. The Gumbel distribution 
has certain drawbacks because it is a non-heavy-tailed distribution and is characterized by 
constant skewness and kurtosis. The generalized extreme value distribution is frequently 
used in this context because it encompasses the three possible limiting distributions for 
a normalized maximum of infinite samples of independent and identically distributed observations.
However, the generalized extreme value distribution might not be a suitable
model when each observed maximum does not come from a large number of observations. 
Hence, other forms of generalizations of the Gumbel distribution might be preferable. 
Our goal is to collect from the existing literature the distributions that contain the Gumbel
distribution embedded in them and to identify those that have flexible skewness and
kurtosis, are heavy-tailed and could be competitive with the generalized extreme value 
distribution. The generalizations of the Gumbel distribution are described and compared 
using an application to a wind speed data set and Monte Carlo simulations. We show 
that some distributions suffer from overparameterization and coincide with other generalized
Gumbel distributions with a smaller number of parameters, i.e., are non-identifiable.
Our study suggests that the generalized extreme value distribution and a mixture of two 
extreme value distributions should be considered in practical applications. 

Key words: Generalized extreme value distribution; Gumbel distribution; Heavy-tailed 
distribution; Non-identifiable model; Kurtosis; Wind speed. 
1 Introduction


Extreme value data usually exhibit excess kurtosis and/or heavy right tails. This is particularly
common in environmental data, e.g., maximum water level (Bruxer et al., 2008), maximum
wind speed (Castillo et al., 2005, Examples 6.1 and 9.14), spatial and temporal variability of
turbulence (Sanford, 1997), daily maximum ozone measurement (Gilleland, 2005), and largest
lichen measurements (Cooley et al., 2006). The generalized extreme value distribution (GEV)
is fairly well-accepted as a standard working model. Despite such well-established theory,
extreme-value distributions are not always preferred in studies of empirical data that do not
satisfy the conditions required by extreme value theory results. Sometimes, the fit for finite
samples is poor. To overcome these issues, other generalizations of the Gumbel distribution
were proposed. For instance, Hosking (1994) proposed a four-parameter distribution to model
maximum precipitation data that has been used in many fields including environmental
sciences, see Hosking & Wallis (1997), Parida (1999), Park & Jung (2002), and Singh & Deng
(2003); Reed & Robson (1999, §17.3.2) recommend a particular three-parameter generalized
logistic distribution as preferable to a GEV distribution for UK annual flood maxima.

The Gumbel distribution, also known as the extreme value distribution or the Gumbel extreme
value distribution, is also used to model extreme values (Coles, 2001; Castillo et al., 2005;
Ferrari & Pinheiro, 2012, 2015). However, its skewness and kurtosis coefficients are constant,
and its right tail is light. Generalizations of the Gumbel distribution with flexible skewness and
kurtosis coefficients could provide better fits for extreme value data.

We present a comprehensive comparative review of distributions that contain the Gumbel
distribution as a special or limiting case. We note that certain generalizations of the Gumbel
distribution proposed in the literature are not identifiable.¹ Some distributions suffer from
overparameterization and coincide with other generalized Gumbel distributions with a smaller
number of parameters. As noted by Huang (2005), “when applying a nonidentifiable model,
different people may draw different conclusions from the same model of the observed data.
Before one can meaningfully discuss the estimation of a model, model identifiability must be
verified.” Therefore, we distinguish between the identifiable and nonidentifiable models and
limit our study to the identifiable families of distributions only.

We investigate and compare the relevant properties of the selected distributions. In particular,
we derive their coefficients of skewness and kurtosis, which are invariant under location-scale
transformations and are primarily controlled by the extra parameters. We graphically illustrate 
their flexibility relative to the Gumbel distribution and highlight those that can achieve high 
values of skewness and kurtosis with a heavy right tail. 

Danielsson et al. (2006) stated “heavy-tailed distributions are often defined in terms of higher
than normal kurtosis. However, the kurtosis of a distribution may be high if either the tails of
the cumulative distribution function are heavier than the normal or the center is more peaked
or both.” Moment-based measures are affected by the extreme tail of the distribution, which
may have negligible probability. These characteristics motivated us to study the tail behavior
of the distributions specifically. To mathematically classify the tail behavior of distributions,
we employ regular variation theory (de Haan, 1970) and a criterion proposed by Rigby et al.
(2014) based on an approximation of the logarithm of the probability density function.

¹ A family of distributions with probability density function $f(x;\theta)$, $\theta \in \Theta$, is said to be identifiable if, for
any $\theta$ and $\theta^*$ in the parameter space $\Theta$, $f(x;\theta) = f(x;\theta^*) \Rightarrow \theta = \theta^*$.

Additionally, we conduct a comprehensive simulation study to evaluate the flexibility of
each selected distribution in fitting data sets generated from the Gumbel distribution and its
different generalizations. The simulated data sets cover a reasonable range of skewness, kurtosis 
and tail heaviness behaviors. We compare the different distributions through the analysis of 
a data set on the maximum monthly wind speed in West Palm Beach, Florida, for the years 
1984-2014. 

The paper is organized as follows. In Section 2, we present the Gumbel distribution and its
generalizations. In Section 3, we study the right tail heaviness of the identifiable distributions.
Monte Carlo simulations are presented in Section 4, and an application to a real data set is
provided in Section 5. The paper ends with conclusions in Section 6. Technical details are
given in the Supplement.


2 The Gumbel distribution and its generalizations 

We present selected characteristics of the Gumbel distribution and distributions that contain
the Gumbel distribution as a special or limiting case. For the identifiable distributions, the
moments, $p$-quantile ($x_p$), skewness ($\gamma_1$) and kurtosis ($\gamma_2$) coefficients are summarized in the
Supplement. Random draws from distributions with closed-form $p$-quantiles can be generated
by replacing $p$ with a standard uniform distributed observation. For the others, generating
methods are given.


Gumbel distribution (EV). Let $X \sim \mathrm{EV}_{\max}(\mu, \sigma)$ be a continuous random variable with
a maximum extreme value distribution. The probability density function (pdf) and cumulative
distribution function (cdf) are, respectively,

$$f_{\mathrm{EV}}(x;\mu,\sigma) = \frac{1}{\sigma}\exp\left\{-\frac{x-\mu}{\sigma}-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\} \quad\text{and}\quad F_{\mathrm{EV}}(x;\mu,\sigma) = \exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}, \quad x \in \mathbb{R}, \tag{1}$$

where $\mu \in \mathbb{R}$ is the location parameter and $\sigma > 0$ is the scale parameter. This distribution
is also known as the Gumbel or type I extreme value distribution. The distribution in (1)
is one of the three possible limiting laws of the standardized maximum of independent and
identically distributed random variables (Gnedenko, 1943). It is frequently invoked to model
extreme events; see, e.g., Castillo et al. (2005, Table 9.16) and Coles (2001, Section 3.4.1). We
refer to this distribution as the maximum extreme value distribution to distinguish it from
the minimum extreme value distribution, which is also often known as the Gumbel or type I
extreme value distribution in the statistical literature.

The coefficients of skewness and kurtosis of the Gumbel distribution are constant, $\gamma_{1,\mathrm{EV}} =
1.14$ and $\gamma_{2,\mathrm{EV}} = 5.4$, respectively, i.e., parameter independent. This restriction motivates more
flexible and useful generalizations of the Gumbel distribution to fit real data.

Hereafter, the maximum extreme value or Gumbel variable will be referred to as Gumbel 
and denoted by EV. 
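
As a minimal sketch of the inverse-transform generation mentioned at the start of this section, the following Python code inverts the Gumbel cdf $\exp\{-\exp(-(x-\mu)/\sigma)\}$ and evaluates the constant skewness and kurtosis from their known closed forms. The function names and the seed handling are illustrative assumptions, not part of the original study.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329   # mean of the standard Gumbel distribution
ZETA3 = 1.2020569031595943         # Riemann zeta(3), appears in the skewness

# Closed-form shape coefficients of the Gumbel distribution (parameter free).
GUMBEL_SKEWNESS = 12 * math.sqrt(6) * ZETA3 / math.pi ** 3   # about 1.14
GUMBEL_KURTOSIS = 3 + 12 / 5                                  # 5.4


def gumbel_cdf(x, mu=0.0, sigma=1.0):
    """Gumbel (maximum) cdf."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def gumbel_quantile(p, mu=0.0, sigma=1.0):
    """Closed-form p-quantile obtained by inverting the cdf, 0 < p < 1."""
    return mu - sigma * math.log(-math.log(p))


def gumbel_sample(n, mu=0.0, sigma=1.0, seed=0):
    """Inverse-transform sampling: feed standard uniforms to the quantile."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # skip the endpoint 0.0, where the quantile is undefined
            draws.append(gumbel_quantile(u, mu, sigma))
    return draws
```

For example, `gumbel_sample(1000, mu=0.0, sigma=1.0)` returns 1000 draws whose sample mean should be close to the Euler constant, the mean of the standard Gumbel distribution.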


Generalized extreme value distribution (GEV). The generalized extreme value distribution
(GEV) was defined for the first time by Jenkinson (1955), and the three possible limiting
distributions of the maximum/minimum of random variables are embedded within it. This
distribution is also known as the von Mises extreme value, von Mises-Jenkinson, and Fisher-Tippett
distribution. A historical review of extreme value theory, the main results, and a list of several
areas of application are provided in Kotz & Nadarajah (2000).

Let $X \sim \mathrm{GEV}(\mu, \sigma, \alpha)$ be a generalized extreme value distributed random variable. Its pdf
and cdf are, respectively,

$$f_{\mathrm{GEV}}(x;\mu,\sigma,\alpha) = \frac{1}{\sigma}\left[1+\alpha\,\frac{x-\mu}{\sigma}\right]^{-1/\alpha-1}\exp\left\{-\left[1+\alpha\,\frac{x-\mu}{\sigma}\right]^{-1/\alpha}\right\}$$

and

$$F_{\mathrm{GEV}}(x;\mu,\sigma,\alpha) = \exp\left\{-\left[1+\alpha\,\frac{x-\mu}{\sigma}\right]^{-1/\alpha}\right\}, \quad 1+\alpha\,\frac{x-\mu}{\sigma} > 0,$$

where $\alpha \in \mathbb{R}$. The Gumbel distribution is a particular case of the GEV distribution when
$\alpha \to 0$.

Plots of the pdf and the .99 quantile of GEV($0, 1, \alpha$), and the skewness and kurtosis of
GEV($\mu, \sigma, \alpha$) for selected values of $\alpha$ are shown in Figure 1. The GEV distribution is quite
versatile: the parameter $\alpha$ affects location, dispersion, skewness and kurtosis. Increasing $\alpha$
increases the quantiles, the skewness and kurtosis coefficients, and the right-tail heaviness.
Skewness is defined for $\alpha < 1/3$ and kurtosis for $\alpha < 1/4$. Skewness and kurtosis can assume
different values from those of the Gumbel distribution.
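
The effect of $\alpha$ on the quantiles can be checked numerically. The sketch below assumes the common GEV parameterization in which $\alpha \to 0$ recovers the Gumbel distribution and $\alpha > 0$ gives the Fréchet-type case; the function names and the small-$\alpha$ cutoff are illustrative choices.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, alpha=0.0):
    """GEV cdf exp(-[1 + alpha*z]^(-1/alpha)); Gumbel cdf when alpha is ~0."""
    z = (x - mu) / sigma
    if abs(alpha) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + alpha * z
    if t <= 0.0:
        # Outside the support: below the lower endpoint when alpha > 0,
        # above the upper endpoint when alpha < 0.
        return 0.0 if alpha > 0 else 1.0
    return math.exp(-t ** (-1.0 / alpha))


def gev_quantile(p, mu=0.0, sigma=1.0, alpha=0.0):
    """Closed-form p-quantile obtained by inverting the cdf, 0 < p < 1."""
    if abs(alpha) < 1e-12:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma * ((-math.log(p)) ** (-alpha) - 1.0) / alpha


# The .99 quantile of GEV(0, 1, alpha) for a few shape values.
for a in (-0.2, 0.0, 0.2):
    print(a, round(gev_quantile(0.99, alpha=a), 3))
```

Under this assumed form, the printed .99 quantile grows with $\alpha$, in line with the behavior described above.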
Figure 1: Density function and .999 quantile - GEV($0, 1, \alpha$); skewness and kurtosis -
GEV($\mu, \sigma, \alpha$)


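Exponentiated Gumbel distribution (EGu). Let $X \sim \mathrm{EGu}(\mu, \sigma, \alpha)$ be an exponentiated
Gumbel distributed random variable. Its pdf and cdf are, respectively,

$$f_{\mathrm{EGu}}(x;\mu,\sigma,\alpha) = \frac{\alpha}{\sigma}\left[1-\exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\right]^{\alpha-1}\exp\left\{-\frac{x-\mu}{\sigma}-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}$$

and

$$F_{\mathrm{EGu}}(x;\mu,\sigma,\alpha) = 1-\left[1-\exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\right]^{\alpha}, \quad x \in \mathbb{R},$$

where $\alpha > 0$ (Nadarajah, 2006). The Gumbel distribution is a special case of the EGu
distribution when $\alpha = 1$.

The pdf can be written as

$$f_{\mathrm{EGu}}(x;\mu,\sigma,\alpha) = \frac{\alpha}{\sigma}\exp\left\{-\frac{x-\mu}{\sigma}-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\,{}_2F_1\left(1-\alpha, 1; 1; \exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\right),$$

where ${}_2F_1(a,b;c;z) = \sum_{k=0}^{\infty}[(a)_k(b)_k/(c)_k][z^k/k!]$ for $|z| < 1$ is the hypergeometric function,
$(a)_k = \Gamma(a+k)/\Gamma(a)$ is the Pochhammer symbol and $\Gamma(\cdot)$ is the gamma function. Thus
${}_2F_1(a,1;1;z) = \sum_{k=0}^{\infty}(a)_k z^k/k! = (1-z)^{-a}$ for $|z| < 1$. Note that $|\exp(-\exp(-(x-\mu)/\sigma))| < 1$.
This form of the pdf is computationally highly efficient for evaluating moments of the EGu
distribution if using software that contains an optimized implementation of the hypergeometric
function.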

The right tail is heavier for smaller values of $\alpha > 0$ (Figure 2). When $\alpha$ is close to zero,
minor changes in $\alpha$ lead to significant changes in the quantile values. The skewness and kurtosis
can reach values close to 2 and 9, respectively, indicating that the EGu distribution is more
flexible than the Gumbel distribution.
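
The sensitivity of the quantiles to small $\alpha$ can be illustrated with a short sketch. It assumes the EGu cdf $1-[1-\exp\{-\exp(-(x-\mu)/\sigma)\}]^{\alpha}$; the function names are hypothetical, and `log1p` is used only as a numerical safeguard when $(1-p)^{1/\alpha}$ is tiny.

```python
import math


def egu_cdf(x, mu=0.0, sigma=1.0, alpha=1.0):
    """Exponentiated Gumbel cdf: 1 - [1 - G(x)]^alpha, with G the Gumbel cdf."""
    g = math.exp(-math.exp(-(x - mu) / sigma))
    return 1.0 - (1.0 - g) ** alpha


def egu_quantile(p, mu=0.0, sigma=1.0, alpha=1.0):
    """Closed-form p-quantile, 0 < p < 1.

    Solving 1 - [1 - G]^alpha = p gives G = 1 - (1 - p)^(1/alpha); the
    quantity -ln(G) is computed with log1p to stay accurate for small alpha.
    """
    minus_log_g = -math.log1p(-((1.0 - p) ** (1.0 / alpha)))
    return mu - sigma * math.log(minus_log_g)


# The .99 quantile of EGu(0, 1, alpha) as alpha approaches zero.
for a in (2.0, 1.0, 0.5, 0.25, 0.1):
    print(a, round(egu_quantile(0.99, alpha=a), 3))
```

With $\alpha = 1$ the quantile coincides with the Gumbel quantile, and the printed values increase quickly as $\alpha$ decreases.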


Figure 2: Density function and .99 quantile - EGu($0, 1, \alpha$); skewness and kurtosis - EGu($\mu, \sigma, \alpha$)


Transmuted extreme value distribution (TEV). Shaw & Buckley (2009) defined a transformation
known as the rank transmutation map, with the aim of obtaining distributions with
skewness and kurtosis distinct from those of the normal distribution.

Transmutation is a composite map of a cumulative distribution function with a quantile
function of another distribution defined on the same sample space. A particular case of rank
transmutation map is derived by considering

$$T_n(u) = F_2(F_1^{-1}(u)) = u + \alpha u(1-u),$$

which leads to

$$F_2(x) = (1+\alpha)F_1(x) - \alpha F_1(x)^2,$$

for $|\alpha| \le 1$, known as the quadratic rank transmutation. There are two important boundary
cases. When $\alpha = -1$, $F_2(x) = F_1(x)^2$, i.e., $F_2$ is the distribution of the maximum of two
independent variables with distribution $F_1$. Analogously, when $\alpha = 1$, $F_2$ is the distribution
of the minimum. Motivated by the various applications of extreme value theory, particularly
the Gumbel distribution, Aryal & Tsokos (2009) defined a new distribution known as the
transmuted extreme value distribution (TEV) by replacing $F_1$ with a Gumbel cdf.

Let $X \sim \mathrm{TEV}(\mu, \sigma, \alpha)$ be a transmuted extreme value distributed random variable. Its pdf
and cdf are, respectively,

$$f_{\mathrm{TEV}}(x;\mu,\sigma,\alpha) = \frac{1}{\sigma}\exp\left\{-\frac{x-\mu}{\sigma}-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\left[1+\alpha-2\alpha\exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\right]$$

and

$$F_{\mathrm{TEV}}(x;\mu,\sigma,\alpha) = (1+\alpha)\exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}-\alpha\exp\left\{-2\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}, \quad x \in \mathbb{R},$$

where $|\alpha| \le 1$. The Gumbel distribution is a particular case of the TEV distribution when
$\alpha = -1$ or $\alpha = 0$. Note that to make the TEV($\mu, \sigma, \alpha$) family of distributions identifiable, it is
sufficient to restrict $\alpha$ to the set $(-1, 1]$.
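
The reason $\alpha = -1$ must be excluded can be verified numerically: squaring a Gumbel cdf gives another Gumbel cdf with location shifted by $\sigma\ln 2$, so two different parameter vectors describe the same distribution. The sketch below applies the quadratic rank transmutation to the Gumbel cdf; the function names are illustrative.

```python
import math


def gumbel_cdf(x, mu=0.0, sigma=1.0):
    """Gumbel (maximum) cdf."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def tev_cdf(x, mu=0.0, sigma=1.0, alpha=0.0):
    """Quadratic rank transmutation of the Gumbel cdf, |alpha| <= 1."""
    g = gumbel_cdf(x, mu, sigma)
    return (1.0 + alpha) * g - alpha * g * g


# TEV(mu, sigma, -1) and the Gumbel law with location mu + sigma*ln(2) agree.
mu, sigma = 0.5, 2.0
for x in (-3.0, 0.0, 1.0, 4.0, 10.0):
    lhs = tev_cdf(x, mu, sigma, alpha=-1.0)
    rhs = gumbel_cdf(x, mu + sigma * math.log(2.0), sigma)
    print(x, lhs, rhs)
```

The two printed columns agree up to floating-point rounding, which is the overlap that the restriction to $(-1, 1]$ removes.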

The TEV distribution is more flexible than the Gumbel distribution but less flexible
than the GEV and EGu distributions, with maximum .99 quantile (for $\mu = 0$ and $\sigma = 1$) and
coefficients of skewness and kurtosis lower than 6, 2 and 7, respectively (Figure 3). Note that
from the pdf and .99 quantile plots, the right tail gets heavier for smaller values of $-1 < \alpha < 1$.


Figure 3: Density function and .99 quantile - TEV($0, 1, \alpha$); skewness and kurtosis - TEV($\mu, \sigma, \alpha$)


Kumaraswamy Gumbel distribution. Cordeiro et al. (2012) defined a generalization of a
cdf $G(x)$ from the Kumaraswamy distribution, which they referred to as Kum-G. The cdfs of
the Kumaraswamy and Kum-G distributions are given, respectively, by

$$F_{\mathrm{Kum}}(x;\alpha,\beta) = 1-(1-x^{\alpha})^{\beta}, \quad x \in (0,1),$$

and

$$F_{\mathrm{Kum\text{-}G}}(x;\alpha,\beta) = 1-\{1-G(x)^{\alpha}\}^{\beta}, \quad x \in \mathbb{R},$$

where $\alpha > 0$ and $\beta > 0$. If the distribution $G(x)$ is EV($\mu, \sigma$), the cdf is defined by

$$F_{\mathrm{KumGum}}(x;\mu,\sigma,\alpha,\beta) = 1-\left[1-\exp\left\{-\alpha\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\right]^{\beta},$$

where $\alpha > 0$ and $\beta > 0$.

Note that if $X \sim \mathrm{KumGum}(\mu, \sigma, \alpha, \beta)$, then

$$F_{\mathrm{KumGum}}(x;\mu,\sigma,\alpha,\beta) = 1-[1-\exp(-\exp(-(x-\mu^*)/\sigma))]^{\beta} = F_{\mathrm{KumGum}}(x;\mu^*,\sigma,1,\beta) = F_{\mathrm{EGu}}(x;\mu^*,\sigma,\beta),$$

where $\mu^* = \mu + \sigma\ln\alpha$. Therefore, the Kumaraswamy Gumbel family of distributions
KumGum($\mu, \sigma, \alpha, \beta$), where $\alpha > 0$ and $\beta > 0$, is nonidentifiable. It coincides with the exponentiated
Gumbel family of distributions EGu($\mu^*, \sigma, \beta$), where $\mu^* \in \mathbb{R}$, $\sigma > 0$ and $\beta > 0$. In other
words, the Kumaraswamy Gumbel family of distributions has four parameters but corresponds
to a family with only three parameters. This is a typical case of parameter redundancy, i.e.,
overparameterization (Catchpole & Morgan, 1997). Therefore, this distribution will not be
considered hereafter.
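
The redundancy can be confirmed numerically. The sketch below evaluates the Kum-G construction with a Gumbel baseline and the exponentiated Gumbel cdf at the shifted location $\mu^* = \mu + \sigma\ln\alpha$; the function names and the parameter values are arbitrary illustrations.

```python
import math


def gumbel_cdf(x, mu, sigma):
    """Gumbel (maximum) cdf."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def kumgum_cdf(x, mu, sigma, alpha, beta):
    """Kum-G cdf 1 - {1 - G(x)^alpha}^beta with a Gumbel baseline G."""
    return 1.0 - (1.0 - gumbel_cdf(x, mu, sigma) ** alpha) ** beta


def egu_cdf(x, mu, sigma, alpha):
    """Exponentiated Gumbel cdf 1 - [1 - G(x)]^alpha."""
    return 1.0 - (1.0 - gumbel_cdf(x, mu, sigma)) ** alpha


mu, sigma, alpha, beta = 0.3, 1.7, 2.5, 0.8
mu_star = mu + sigma * math.log(alpha)   # location absorbing alpha

# Largest discrepancy between the four- and three-parameter cdfs on a grid.
grid = [-5.0 + 0.5 * i for i in range(31)]
max_gap = max(
    abs(kumgum_cdf(x, mu, sigma, alpha, beta) - egu_cdf(x, mu_star, sigma, beta))
    for x in grid
)
print(max_gap)
```

The printed gap is at the level of floating-point rounding, so data alone cannot separate $(\mu, \alpha)$ in the four-parameter form.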


Generalized three-parameter Gumbel distribution (GTIEV3). Dubey (1969) built a
generalization of the Gumbel distribution which is known as the generalized type I extreme
value or type I generalized logistic distribution, and we denote it by GTIEV($\mu, \sigma, \alpha, \beta$). Its cdf
is given by

where $\alpha > 0$ and $\beta > 0$. This distribution was first defined by Hald (1952). Note that

where $\mu^* = \mu + \sigma\ln(\sigma\alpha\beta^{-1}) \in \mathbb{R}$. Therefore, the generalized Gumbel family GTIEV($\mu, \sigma, \alpha, \beta$),
where $\sigma > 0$, $\alpha > 0$, and $\beta > 0$, is nonidentifiable. It coincides with a family of
distributions with only three parameters, say GTIEV3($\mu, \sigma, \alpha$), where $\mu \in \mathbb{R}$, $\sigma > 0$ and $\alpha > 0$.

Let $X \sim \mathrm{GTIEV3}(\mu, \sigma, \alpha)$ be a generalized three-parameter Gumbel random variable. Its
pdf and cdf are defined, respectively, as

where $\alpha > 0$. The Gumbel distribution is a limiting case of GTIEV3 when $\alpha \to \infty$. The
three-parameter kappa distribution defined in Jeong et al. (2014, eq. 2) with positive shape
parameter coincides with the GTIEV3 distribution in a different parameterization.

The GTIEV3 distribution is more flexible than the Gumbel distribution but less flexible
than the GEV and EGu distributions (Figure 4). The .99 quantile and skewness coefficient
are always lower than the corresponding Gumbel values whereas the kurtosis can be
greater. The right tail of the GTIEV3 distribution cannot be heavier than that of the Gumbel
distribution, in contrast to its left tail. This observation suggests that the GTIEV3 distribution
is not useful for modeling right-skewed data.



Figure 4: Density function and .99 quantile - GTIEV3($0, 1, \alpha$); skewness and kurtosis -
GTIEV3($\mu, \sigma, \alpha$)


Three-parameter exponential-gamma distribution (EGa). Ojo (2001) presents a generalization
of the Gumbel distribution, with three parameters $\mu$, $\sigma$, and $\alpha$. We refer to it as
the three-parameter exponential-gamma distribution and denote it by EGa($\mu, \sigma, \alpha$).

Let $X \sim \mathrm{EGa}(\mu, \sigma, \alpha)$ be an exponential-gamma distributed random variable. Its pdf and
cdf are, respectively,

where $\alpha > 0$, and $\Gamma(s,x) = \int_x^{\infty} t^{s-1}\exp(-t)\,dt$ is the incomplete gamma function. The Gumbel
distribution is a particular case of EGa when $\alpha = 1$. To generate $X \sim \mathrm{EGa}(\mu, \sigma, \alpha)$, we write
$X = \mu - \sigma\ln(Y)$, where $Y \sim \mathrm{gamma}(\alpha, 1)$.²
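
The generation recipe translates directly into code. The sketch below uses `random.gammavariate(alpha, 1.0)` from the Python standard library, which draws from a gamma distribution with shape `alpha` and scale 1 (equivalently rate 1); the function name and the seed argument are illustrative.

```python
import math
import random


def ega_sample(n, mu=0.0, sigma=1.0, alpha=1.0, seed=0):
    """Draw n values of X = mu - sigma*ln(Y), with Y ~ gamma(alpha, 1)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        y = rng.gammavariate(alpha, 1.0)
        if y > 0.0:  # guard against a zero draw, where ln(y) is undefined
            draws.append(mu - sigma * math.log(y))
    return draws


def empirical_quantile(values, p):
    """Simple order-statistic estimate of the p-quantile."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]


# Smaller alpha pushes the upper quantiles to the right.
for a in (2.0, 1.0, 0.5):
    xs = ega_sample(50000, alpha=a, seed=1)
    print(a, round(empirical_quantile(xs, 0.99), 2))
```

With $\alpha = 1$, $Y$ is standard exponential and the draws follow the standard Gumbel distribution, which gives a quick sanity check on the sampler.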

Similarly to the EGu distribution, the right tail gets heavier for smaller values of $\alpha > 0$
(Figure 5). For $\alpha$ close to zero, the .99 quantile can be greater than that of the Gumbel,
TEV and GTIEV3 distributions. The .99 quantile plots indicate that when $\alpha$ is close to

Figure 5: Density function and .99 quantile - EGa($0, 1, \alpha$); skewness and kurtosis - EGa($\mu, \sigma, \alpha$)


Generalized Gumbel distribution (GGu). Cooray (2010) derived a distribution which is
referred to as the generalized Gumbel distribution (GGu).

Let $X \sim \mathrm{GGu}(\mu, \sigma, \alpha)$ be a generalized Gumbel distributed random variable. Its pdf and
cdf are, respectively,

where $\mu \in \mathbb{R}$, $\sigma > 0$ and $\alpha > 0$.³ When $\alpha = 1$, the GGu distribution reduces to a Gumbel
distribution. Figure 6 shows the plots of the pdf for selected parameters, and the .99 quantile,
skewness and kurtosis of GGu($0, 1, \alpha$). Similarly to the EGu and EGa distributions, for $\alpha$
close to zero, the .99 quantile can be greater than that of the Gumbel, TEV and GTIEV3
distributions. The . i

lead to significant c
greater than those o

² The parameterization for the gamma distribution is such that, if $W \sim \mathrm{gamma}(\alpha, \beta)$, its pdf is $f(w) =
(\beta^{\alpha}/\Gamma(\alpha))\,w^{\alpha-1}\exp(-\beta w)$, $w > 0$.



Figure 6: Density function, .99 quantile, skewness and kurtosis - GGu(0,1, a) 


Exponential-gamma distribution. As a generalization of the Gumbel distribution,
Adeyemi & Ojo (2003) proposed the asymptotic distribution of the $r$-th maximum extremes
obtained by Gumbel (1935), whose pdf is

$$f(x) = \frac{r^{r}}{\Gamma(r)}\exp(-r\exp(-x))\exp(-rx), \quad x \in \mathbb{R},$$

for $r > 0$, the shape parameter. When $r = 1$, this distribution reduces to a Gumbel distribution.
Its generalized form is known as the exponential-gamma distribution ExpGamma($\mu, \sigma, \alpha, \beta$)
(Balakrishnan & Leung, 1988, p. 34) and is defined by the pdf and cdf given by, respectively,

where $\alpha > 0$ and $\beta > 0$. When $\alpha = \beta = 1$, the exponential-gamma distribution reduces
to a Gumbel distribution, and it reduces to the EGa distribution when $\alpha = 1$. Note that, if
$X \sim \mathrm{ExpGamma}(\mu, \sigma, \alpha, \beta)$ then

where $\mu^* = \mu + \sigma\ln\alpha \in \mathbb{R}$. Hence, the exponential-gamma family of distributions
ExpGamma($\mu, \sigma, \alpha, \beta$), where $\mu \in \mathbb{R}$, $\sigma > 0$, $\alpha > 0$ and $\beta > 0$, is nonidentifiable. It coincides with
the three-parameter exponential-gamma family of distributions EGa($\mu^*, \sigma, \beta$), where $\mu^* \in \mathbb{R}$,
$\sigma > 0$ and $\beta > 0$. Therefore, this distribution will not be considered hereafter.

³ Cooray (2010) considers the parameter space $\mu \in \mathbb{R}$ and $0 < \alpha\sigma < \infty$.


Type IV generalized logistic distribution (GLIV). Prentice (1975) proposed the type IV
generalized logistic distribution (GLIV). Let $X \sim \mathrm{GLIV}(\mu, \sigma, \alpha, \beta)$ be a type IV generalized
logistic distributed random variable. Its pdf and cdf are, respectively,

where $\alpha > 0$, $\beta > 0$ and ${}_2F_1(a, b; c; z)$ is the hypergeometric function mentioned previously.
When $\alpha = 1$ and $\beta \to \infty$, the type IV generalized logistic distribution reduces to a Gumbel
distribution, and it reduces to a generalized three-parameter Gumbel distribution (GTIEV3)
when $\alpha = 1$. To generate $X \sim \mathrm{GLIV}(\mu, \sigma, \alpha, \beta)$, write $X = \mu - \sigma\ln Y$, where $Y \sim F(2\alpha, 2\beta)$.⁴

Similarly to EGu and EGa, for fixed $\beta$, the right tail gets heavier for smaller $\alpha > 0$ (Figure 7).
For $\alpha$ values close to zero, the .99 quantile can be greater than the Gumbel, TEV, and GTIEV3
values. The quantile plots indicate that, when $\alpha$ is close to zero, small changes in $\alpha$ lead to
significant changes in the quantile values. The skewness and kurtosis can reach values close to
2 and 9, respectively, indicating that the GLIV distribution is more flexible than the Gumbel
distribution. We can verify that $f_{\mathrm{GLIV}}(x; \alpha, \beta) = f_{\mathrm{GLIV}}(-x; \beta, \alpha)$, and thus, for fixed $\alpha$, the left
tail is heavier for small values of $\beta$.

Prentice (1976) presents a simplified form of this distribution. When $\alpha = \beta$, the type IV
generalized logistic distribution is symmetric about $x = \mu$, and the distribution is known as the
type III generalized logistic distribution.


⁴ If $W \sim F(a, b)$, its pdf is $f(w) = (1/B(a,b))\,(a/b)^{a}\,w^{a-1}/(1 + (a/b)w)^{a+b}$, for $w > 0$.

Figure 7: Density function and .99 quantile - GLIV($0, 1, \alpha, \beta$); skewness and kurtosis -
GLIV($\mu, \sigma, \alpha, \beta$)


Exponentiated generalized Gumbel distribution. Cordeiro et al. (2013) defined a class
of distributions known as the exponentiated generalized distribution (EG), by

$$F(x) = \left[1-\{1-G(x)\}^{\alpha}\right]^{\beta},$$

where $\alpha > 0$ and $\beta > 0$ are two additional shape parameters and $G(x)$ is a continuous cdf.
When $G(x)$ is the Gumbel cdf, EG becomes the exponentiated generalized Gumbel distribution
(EGGu). Let $X \sim \mathrm{EGGu}(\mu, \sigma, \alpha, \beta)$ be an exponentiated generalized Gumbel distributed
random variable with cdf

$$F_{\mathrm{EGGu}}(x;\mu,\sigma,\alpha,\beta) = \left[1-\left\{1-\exp\left(-\exp\left(-\frac{x-\mu}{\sigma}\right)\right)\right\}^{\alpha}\right]^{\beta},$$

where $\alpha > 0$ and $\beta > 0$. The Gumbel distribution is a particular case of EGGu when $\alpha = \beta = 1$
and the aforementioned exponentiated Gumbel distribution EGu is a special case when $\beta = 1$.
Note that, if $X \sim \mathrm{EGGu}(\mu, \sigma, 1, \beta)$, then

$$F_{\mathrm{EGGu}}(x;\mu,\sigma,1,\beta) = \exp(-\exp(-(x-(\mu+\sigma\ln\beta))/\sigma)) = F_{\mathrm{EGGu}}(x;\mu^*,\sigma,1,1) = F_{\mathrm{EV}}(x;\mu^*,\sigma),$$

where $\mu^* = \mu + \sigma\ln\beta$. Hence, the exponentiated generalized Gumbel family of distributions
EGGu($\mu, \sigma, \alpha, \beta$), where $\mu \in \mathbb{R}$, $\sigma > 0$, $\alpha > 0$ and $\beta > 0$, is nonidentifiable. It coincides with the Gumbel family
of distributions EV($\mu^*, \sigma$), where $\mu^* \in \mathbb{R}$ and $\sigma > 0$, when $\alpha = 1$. Therefore, this distribution
will not be considered further.


Beta Gumbel distribution. Nadarajah & Kotz (2004) proposed a generalization of the
Gumbel distribution, which they referred to as the beta Gumbel distribution (BG), from a
generalized class of distributions defined by

for $\alpha > 0$ and $\beta > 0$, where $G(x)$ is a cdf, $B(\alpha, \beta)$ is the beta function and

is the incomplete beta function, by taking $G(x)$ as the Gumbel cdf.

Let $X \sim \mathrm{BG}(\mu, \sigma, \alpha, \beta)$ be a beta Gumbel distributed random variable with cdf

where $\alpha > 0$ and $\beta > 0$. When $\alpha = 1$ and $\beta = 1$, the beta Gumbel distribution reduces to
a Gumbel distribution, and it reduces to an exponentiated Gumbel distribution (EGu) when
$\alpha = 1$.

If $X \sim \mathrm{BG}(\mu, \sigma, \alpha, 1)$, then

$$F_{\mathrm{BG}}(x;\mu,\sigma,\alpha,1) = \alpha\int_0^{\exp(-\exp(-(x-\mu)/\sigma))} t^{\alpha-1}\,dt = F_{\mathrm{BG}}(x;\mu^*,\sigma,1,1) = F_{\mathrm{EV}}(x;\mu^*,\sigma),$$

where $\mu^* = \mu + \sigma\ln\alpha$. Therefore the beta Gumbel family of distributions BG($\mu, \sigma, \alpha, \beta$) with
$\mu \in \mathbb{R}$, $\sigma > 0$, $\alpha > 0$ and $\beta > 0$ is nonidentifiable. It coincides with the Gumbel family of
distributions EV($\mu^*, \sigma$) with $\mu^* \in \mathbb{R}$ and $\sigma > 0$, when $\beta = 1$. Therefore, this distribution will
not be studied hereafter.


Kummer beta generalized Gumbel distribution. Pescim et al. (2012) defined a class
of distributions known as the Kummer beta generalized family (KBG). From an arbitrary cdf
$G(x)$, the KBG family of distributions is defined by

$$F(x) = K\int_0^{G(x)} t^{\alpha-1}(1-t)^{\beta-1}\exp(-\gamma t)\,dt,$$

where $\alpha > 0$, $\beta > 0$ and $\gamma \in \mathbb{R}$ are shape parameters and

$$K^{-1} = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}\,{}_1F_1(\alpha;\alpha+\beta;-\gamma),$$

where ${}_1F_1(\alpha;\alpha+\beta;-\gamma) = \sum_{k=0}^{\infty}\frac{(\alpha)_k(-\gamma)^k}{(\alpha+\beta)_k\,k!}$ is the confluent hypergeometric function,
$(d)_k = d(d+1)\cdots(d+k-1)$ denotes the ascending factorial, and $(d)_0 = 1$. When $G(x)$ is a
Gumbel cdf, it is known as the KBG-Gumbel distribution (KBGGu).

Let $X \sim \mathrm{KBGGu}(\mu, \sigma, \alpha, \beta, \gamma)$ be a KBG-Gumbel distributed random variable with cdf

$$F_{\mathrm{KBGGu}}(x;\mu,\sigma,\alpha,\beta,\gamma) = K\int_0^{\exp(-\exp(-(x-\mu)/\sigma))} t^{\alpha-1}(1-t)^{\beta-1}\exp(-\gamma t)\,dt,$$

where $\alpha > 0$, $\beta > 0$ and $\gamma \in \mathbb{R}$. When $\alpha = 1$, $\beta = 1$ and $\gamma = 0$, the KBG-Gumbel distribution
reduces to the Gumbel distribution and it reduces to a beta Gumbel distribution when $\gamma = 0$.
If $X \sim \mathrm{KBGGu}(\mu, \sigma, \alpha, 1, 0)$, then

$$F_{\mathrm{KBGGu}}(x;\mu,\sigma,\alpha,1,0) = \alpha\int_0^{\exp(-\exp(-(x-\mu)/\sigma))} t^{\alpha-1}\,dt = F_{\mathrm{KBGGu}}(x;\mu^*,\sigma,1,1,0) = F_{\mathrm{EV}}(x;\mu^*,\sigma),$$

where $\mu^* = \mu + \sigma\ln\alpha$. Hence, the Kummer beta Gumbel family of distributions
KBGGu($\mu, \sigma, \alpha, \beta, \gamma$) with $\mu \in \mathbb{R}$, $\sigma > 0$, $\alpha > 0$, $\beta > 0$ and $\gamma \in \mathbb{R}$ is nonidentifiable. It
coincides with the Gumbel family of distributions EV($\mu^*, \sigma$) with $\mu^* \in \mathbb{R}$ and $\sigma > 0$, when $\beta = 1$
and $\gamma = 0$. Therefore, this distribution will not be examined in the following discussion.
Two-component extreme value distribution (TCEV). In studying annual flood series,
Rossi et al. (1984) considered an approach to account for both the presence of outliers and high
skewness. That approach results from assuming that flood peaks do not all arise from one and
the same distribution but, instead, from a two-component extreme value mixture (TCEV). One
of the components generates ordinary (more frequent and less severe in the mean) floods. The
other exhibits much greater variability and tends to generate rarer but more severe floods.

Let $X \sim \mathrm{TCEV}(\mu, \sigma, \mu_1, \sigma_1, \alpha)$ be a two-component extreme value distributed random
variable. Its pdf and cdf are, respectively,

$$f_{\mathrm{TCEV}}(x;\mu,\sigma,\mu_1,\sigma_1,\alpha) = (1-\alpha)\,f_{\mathrm{EV}}(x;\mu,\sigma) + \alpha\,f_{\mathrm{EV}}(x;\mu_1,\sigma_1)$$

and

$$F_{\mathrm{TCEV}}(x;\mu,\sigma,\mu_1,\sigma_1,\alpha) = (1-\alpha)\,F_{\mathrm{EV}}(x;\mu,\sigma) + \alpha\,F_{\mathrm{EV}}(x;\mu_1,\sigma_1),$$

where $\mu \in \mathbb{R}$ and $\mu_1 \in \mathbb{R}$ are location parameters, $\sigma > 0$ and $\sigma_1 > 0$ are dispersion parameters
and $0 < \alpha < 1$. Greater values of $\alpha$ increase the weight of the second component.

For $0 < \alpha < 1$, $F(x; \mu, \sigma, \mu_1, \sigma_1, \alpha) = F(x; \mu_1, \sigma_1, \mu, \sigma, 1-\alpha)$, and consequently the mixture
is nonidentifiable. The lack of identifiability due to the label-switching effect is overcome by
imposing identifiability constraints on the parameters. It is sufficient to consider $0 < \alpha < 0.5$
to achieve identifiability.⁵ When $\alpha \to 0$, the two-component extreme value distribution reduces
to a Gumbel distribution.

Data from $X \sim \mathrm{TCEV}(\mu, \sigma, \mu_1, \sigma_1, \alpha)$ may be generated from the conditional distributions
$X \mid Z = 0 \sim \mathrm{EV}(\mu, \sigma)$ and $X \mid Z = 1 \sim \mathrm{EV}(\mu_1, \sigma_1)$, where $Z \sim \mathrm{Bernoulli}(\alpha)$.
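
The conditional generation scheme can be sketched as follows: draw the Bernoulli label $Z$ first and then sample from the corresponding Gumbel component by inverse transform. The code assumes the mixture cdf $(1-\alpha)F_{\mathrm{EV}}(x;\mu,\sigma) + \alpha F_{\mathrm{EV}}(x;\mu_1,\sigma_1)$; the function names and the seed argument are illustrative.

```python
import math
import random


def gumbel_cdf(x, mu, sigma):
    """Gumbel (maximum) cdf."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def tcev_cdf(x, mu, sigma, mu1, sigma1, alpha):
    """Two-component mixture cdf with weight alpha on the second component."""
    return (1.0 - alpha) * gumbel_cdf(x, mu, sigma) + alpha * gumbel_cdf(x, mu1, sigma1)


def tcev_sample(n, mu, sigma, mu1, sigma1, alpha, seed=0):
    """Draw Z ~ Bernoulli(alpha), then X | Z from the matching Gumbel law."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        z = rng.random() < alpha                      # label of the component
        loc, scale = (mu1, sigma1) if z else (mu, sigma)
        u = rng.random()
        if 0.0 < u < 1.0:                             # avoid log(0) at u = 0
            draws.append(loc - scale * math.log(-math.log(u)))
    return draws


# Compare the empirical and theoretical cdf at x = 5 for TCEV(0, 1, 10, 5, 0.2).
xs = tcev_sample(50000, 0.0, 1.0, 10.0, 5.0, 0.2, seed=1)
empirical = sum(x <= 5.0 for x in xs) / len(xs)
print(empirical, tcev_cdf(5.0, 0.0, 1.0, 10.0, 5.0, 0.2))
```

Swapping the two components while replacing $\alpha$ with $1-\alpha$ leaves `tcev_cdf` unchanged, which is the label-switching effect described above.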

Figure 8 shows the plots of the pdf for selected parameters, and the .99 quantile, skewness
and kurtosis of TCEV($0, 1, 10, 5, \alpha$). Note that the probability of high values of the random
variable and the .99 quantile grow with $\alpha$. The skewness and kurtosis coefficients are smaller
than the corresponding Gumbel values for all $\alpha$.

In summary, Table 1 presents the generalizations of the Gumbel distribution presented above; the
nonidentifiable distributions are marked with an asterisk. In the following sections, we will
consider all the identifiable families of distributions, namely EV, GEV, EGu, TEV, GTIEV3,
EGa, GGu, GLIV, and TCEV.

⁵ For purposes of parameter estimation, we follow Aitkin & Rubin (1985), who suggest theoretical parameter
constraints but no parameter constraints for estimation.

Figure 8: Density function, .99 quantile, skewness and kurtosis - TCEV($0, 1, 10, 5, \alpha$)


Table 1: Generalizations of the Gumbel distribution.

Distribution                                                                            Proposed by
Generalized extreme value            GEV($\mu, \sigma, \alpha$)                          Jenkinson (1955)
Type IV generalized logistic         GLIV($\mu, \sigma, \alpha, \beta$)                  Prentice (1975)
Two-component extreme value          TCEV($\mu, \sigma, \mu_1, \sigma_1, \alpha$)        Rossi et al. (1984)
Three-parameter exponential-gamma    EGa($\mu, \sigma, \alpha$)                          Ojo (2001)
Exponentiated Gumbel                 EGu($\mu, \sigma, \alpha$)                          Nadarajah (2006)
Transmuted extreme value             TEV($\mu, \sigma, \alpha$)                          Aryal & Tsokos (2009)
Generalized Gumbel                   GGu($\mu, \sigma, \alpha$)                          Cooray (2010)
Generalized three-parameter Gumbel   GTIEV3($\mu, \sigma, \alpha$)                       Jeong et al. (2014)
Generalized type I extreme value     GTIEV($\mu, \sigma, \alpha, \beta$) *               Dubey (1969)
Exponential-gamma                    ExpGamma($\mu, \sigma, \alpha, \beta$) *            Adeyemi & Ojo (2003)
Beta Gumbel                          BG($\mu, \sigma, \alpha, \beta$) *                  Nadarajah & Kotz (2004)
Kummer beta generalized Gumbel       KBGGu($\mu, \sigma, \alpha, \beta, \gamma$) *       Pescim et al. (2012)
Kumaraswamy Gumbel                   KumGum($\mu, \sigma, \alpha, \beta$) *              Cordeiro et al. (2012)
Exponentiated generalized Gumbel     EGGu($\mu, \sigma, \alpha, \beta$) *                Cordeiro et al. (2013)


3 Right-tail heaviness 

Heavy right-tailed distributions have been used to model phenomena in economics, ecology, 
bibliometrics, and biometry, among others; see, for instance, Markovich (2007) and Resnick 
(2007). We next describe two criteria used to evaluate the right-tail heaviness of a distribution. 

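Informally, a regularly varying function is asymptotically equivalent to a power function.
Formally, a Lebesgue measurable function $U: \mathbb{R}_+ \to \mathbb{R}_+$ is regularly varying at infinity with
index $\rho$ ($U \in RV_{\rho}$), if $\lim_{t\to\infty} U(tx)/U(t) = x^{\rho}$ for $x > 0$. If $\rho = 0$, $U$ is referred to as slowly
varying. The function $U$ varies rapidly at infinity (or is rapidly varying at infinity with index
$\infty$ ($-\infty$), or $U \in RV_{\infty}$ ($U \in RV_{-\infty}$)) (de Haan, 1970, p. 4), if, for all $x > 0$,

$$\lim_{t\to\infty}\frac{U(tx)}{U(t)} = x^{\infty} := \begin{cases} 0 & \text{if } 0 < x < 1,\\ 1 & \text{if } x = 1,\\ \infty & \text{if } x > 1, \end{cases} \qquad \left(\lim_{t\to\infty}\frac{U(tx)}{U(t)} = x^{-\infty} := \begin{cases} \infty & \text{if } 0 < x < 1,\\ 1 & \text{if } x = 1,\\ 0 & \text{if } x > 1. \end{cases}\right)$$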

A distribution with cdf $F$ is said to have a heavy right tail whenever the survival function,
$\bar{F} := 1 - F$, is regularly varying at infinity with a negative index of regular variation
$\rho = -1/\xi$, $\xi > 0$, i.e., $\lim_{t\to\infty} \bar{F}(tx)/\bar{F}(t) = x^{-1/\xi}$. The parameter $\xi$ is known as the tail index,
and is one of the primary parameters of rare events. The distribution is said to have a light
(non-heavy) right tail if the limit equals $x^{-\infty}$, and $\xi = 0$. When the limit equals 1, i.e., $\bar{F}$ is a
slowly varying function, we will say that the distribution has a heavy right tail with tail index
$\xi = \infty$. It follows from de Haan (1970, Corollary 1.2.1 - 2 and 3) that the index of regular
variation is invariant under the location-scale transformation. It is thus sufficient to derive it
from the standard form of the distribution.

The generalized extreme value distribution GEV($\mu, \sigma, \alpha$) has a heavy right tail with tail
index $\alpha$ when $\alpha > 0$ (Fréchet family). It has a non-heavy right tail when $\alpha = 0$ (Gumbel
family). Other heavy right-tailed distributions are, e.g., Student-t($\nu$) ($\xi = 1/\nu$), Cauchy ($\xi = 1$)
and F($\alpha, \beta$) ($\xi = 2/\beta$). The other distributions addressed in this paper⁶, viz., Gumbel (EV),
exponentiated Gumbel (EGu), transmuted extreme value (TEV), generalized three-parameter
Gumbel (GTIEV3), three-parameter exponential-gamma (EGa), type IV generalized logistic
(GLIV), generalized Gumbel distribution (GGu), and two-component extreme value distribution
(TCEV), are all non-heavy right-tailed distributions (see Supplement). Hence, among the
identifiable distributions addressed in this work, the GEV distribution is the only one with a
potentially heavier right tail than that of the Gumbel distribution under the tail index approach.

Rigby et al. (2014) ordered the heaviness of the tails of continuous distributions based
on the logarithm of the pdf. If random variables $X_1$ and $X_2$ have continuous pdfs $f_{X_1}(x)$ and
$f_{X_2}(x)$ and $\lim_{x\to\infty} f_{X_1}(x) = \lim_{x\to\infty} f_{X_2}(x) = 0$, then $X_2$ has a heavier right tail than $X_1$
if and only if $\lim_{x\to\infty} [\ln f_{X_2}(x) - \ln f_{X_1}(x)] = \infty$. There are three main forms for $\ln f_X(x)$
when $x \to \infty$ (right tail) or $x \to -\infty$ (left tail): $\ln f_X(x) \sim -k_2 (\ln|x|)^{k_1}$ (type I),
$\ln f_X(x) \sim -k_4 |x|^{k_3}$ (type II) or $\ln f_X(x) \sim -k_6 \exp(k_5 |x|)$ (type III). The three types are in
decreasing order of tail heaviness. For type I, decreasing $k_1$ results in a heavier tail, while
decreasing $k_2$ for fixed $k_1$ results in a heavier tail. The same holds for the other two types. If two
distributions have the same values of $k_1$ and $k_2$ (analogously for $k_3$ and $k_4$, or $k_5$ and $k_6$), their
right tails are not necessarily equally heavy. In this case, it is necessary to compare the
second-order terms of the logarithm of the pdf to distinguish the distributions.
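The ordering criterion can be illustrated numerically. The following is a minimal sketch, assuming the standard Gumbel log-pdf $-z - e^{-z}$ and the GEV log-pdf in the parameterization used in the Supplement, with an illustrative $\alpha = 0.1$; the helper names are hypothetical. It evaluates the difference $\ln f_{GEV}(x) - \ln f_{EV}(x)$ at increasing $x$.

```python
import math


def log_pdf_gumbel(x, mu=0.0, sigma=1.0):
    """Log-pdf of the Gumbel (EV) distribution."""
    z = (x - mu) / sigma
    return -math.log(sigma) - z - math.exp(-z)


def log_pdf_gev(x, mu=0.0, sigma=1.0, alpha=0.1):
    """Log-pdf of GEV(mu, sigma, alpha) for alpha > 0 (assumed parameterization)."""
    w = 1.0 + alpha * (x - mu) / sigma
    if w <= 0.0:
        return float("-inf")  # outside the support
    return -math.log(sigma) - w ** (-1.0 / alpha) - (1.0 / alpha + 1.0) * math.log(w)


# Difference of log-densities at increasingly large x; it should grow without bound
# if the GEV right tail is heavier than the Gumbel right tail.
points = (10.0, 100.0, 1000.0)
diffs = [log_pdf_gev(x) - log_pdf_gumbel(x) for x in points]
for x, d in zip(points, diffs):
    print(x, d)
```

The differences increase with $x$, which is the behaviour the criterion requires for the GEV distribution with $\alpha > 0$ to be classed as heavier-tailed than the Gumbel distribution.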

Table 2 summarizes the right tail asymptotic form of the logarithm of the pdf for the
distributions mentioned above. The GEV distribution with $\alpha > 0$ is of type I with $k_1 = 1$ and
$k_2 = 1 + 1/\alpha$. As expected, the GEV distribution is the only one that has a 'Paretian type'
right tail (type I with $k_1 = 1$ and $k_2 > 1$). Note that the Cauchy distribution has $k_1 = 1$
and $k_2 = 2$, and hence if $\alpha > 1$, the GEV distribution has a heavier right tail than that of
the Cauchy distribution. The Student-t distribution with $\nu$ degrees of freedom has $k_1 = 1$ and
$k_2 = 1 + \nu$. If $\alpha > 1/2$, the right tail heaviness of the GEV distribution is greater than that of
the Student-t distribution with two degrees of freedom, which is uncommon in real data.
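The two thresholds quoted above follow from comparing the type I coefficients $k_2$ (with $k_1 = 1$ in all three cases), since a smaller $k_2$ corresponds to a heavier tail. A short worked check, using only the values stated in the text:

```latex
\begin{align*}
k_2^{\mathrm{GEV}} < k_2^{\mathrm{Cauchy}}
  &\iff 1 + \frac{1}{\alpha} < 2
   \iff \alpha > 1, \\
k_2^{\mathrm{GEV}} < k_2^{t_\nu}
  &\iff 1 + \frac{1}{\alpha} < 1 + \nu
   \iff \alpha > \frac{1}{\nu},
  \quad \text{so } \alpha > \tfrac{1}{2} \text{ for } \nu = 2 .
\end{align*}
```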

The Gumbel distribution is of type II with $k_3 = 1$ and $k_4 = 1/\sigma$. According to Rigby et al.
(2014), distributions with $k_3 = 1$ are non-heavy tailed. All of the other distributions are also
of type II and non-heavy tailed. The EGu, EGa, GGu, and GLIV distributions have
$k_4 = \alpha/\sigma$, and hence they have a heavier right tail than the Gumbel distribution when
$\alpha < 1$. The TEV, GTIEV3, and TCEV distributions have the same $k_4 = 1/\sigma$ as the Gumbel
distribution. To distinguish among the TEV, GTIEV3 and TCEV distributions it is necessary
to compare the second-order terms of the logarithm of their pdf. Comparing the second-order
terms, the TEV distribution with $\alpha < 0$ and the TCEV distribution have a heavier right tail
than the Gumbel distribution. The right tail of the GTIEV3 distribution is lighter than that
of the Gumbel distribution (see Supplement). These findings agree with the pdf plots shown
in Figures 3, 4, and 8, respectively.

†Recall that we restrict our attention to the identifiable family of distributions only.

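Table 2: Right tail asymptotic form of the logarithm of the pdf for the Gumbel distribution
and its generalizations

Distribution   Parameter             $k_1$   $k_2$            $k_3$   $k_4$
GEV            $\alpha > 0$          1       $1 + 1/\alpha$   -       -
EGu            $\alpha > 0$          -       -                1       $\alpha/\sigma$
EGa            $\alpha > 0$          -       -                1       $\alpha/\sigma$
GLIV           $\alpha > 0$          -       -                1       $\alpha/\sigma$
GGu            $\alpha > 0$          -       -                1       $\alpha/\sigma$
EV             -                     -       -                1       $1/\sigma$
GTIEV3         $\alpha > 0$          -       -                1       $1/\sigma$
TEV            $-1 < \alpha < 1$     -       -                1       $1/\sigma$
TCEV           $0 < \alpha < 0.5$    -       -                1       $1/\sigma_2$ *

* if $\sigma_1 < \sigma_2$.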


4 Monte Carlo simulation results 

We next compare the ability of the Gumbel distribution and its generalizations to model data
taken from different distributions. To this end, we present a Monte Carlo simulation study in
which 10,000 samples of size 500 are generated from and modeled with each of the identifiable
distributions considered in this paper. For generating the data, we set $\mu = 0$ and $\sigma = 1$ (for
the TCEV distribution $\mu_1 = 10$ and $\sigma_1 = 5$). The remaining parameters were chosen in such a
way that the .999 quantile is close to 10 whenever possible. Table 3 presents the distributions
from which the samples were drawn and their .999 quantile and kurtosis.
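One way to picture the parameter choice is to solve for the shape parameter that places the .999 quantile at a target value. The sketch below is an illustration only and not the authors' procedure: it assumes the GEV quantile function $x_p = \mu + (\sigma/\alpha)((-\ln p)^{-\alpha} - 1)$ listed in the Supplement and uses a plain bisection. The names `gev_quantile` and `calibrate_alpha` are hypothetical.

```python
import math


def gev_quantile(p, mu=0.0, sigma=1.0, alpha=0.1):
    """Quantile of GEV(mu, sigma, alpha); alpha = 0 gives the Gumbel (EV) quantile."""
    y = -math.log(p)
    if alpha == 0.0:
        return mu - sigma * math.log(y)
    return mu + (sigma / alpha) * (y ** (-alpha) - 1.0)


def calibrate_alpha(target, p=0.999, lo=1e-6, hi=1.0, tol=1e-10):
    """Bisection for alpha such that the p-quantile of GEV(0, 1, alpha) equals target.

    Relies on the quantile being increasing in alpha for p close to 1.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gev_quantile(p, 0.0, 1.0, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha_star = calibrate_alpha(10.0)
print(alpha_star, gev_quantile(0.999, 0.0, 1.0, alpha_star))
print(gev_quantile(0.999, 0.0, 1.0, 0.1), gev_quantile(0.999, 0.0, 1.0, 0.0))
```

The solution is close to the value $\alpha = 0.1$ used in the study, and the last line reproduces, after rounding, the .999 quantiles reported in Table 3 for GEV(0,1,0.1) and EV(0,1).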

As measures of model adequacy, we use the Akaike information criterion (AIC) and two
modified Anderson-Darling statistics (ADR and AD2R). AIC estimates the dissimilarity
between the fitted and the true distribution over the whole support; a smaller AIC suggests that
the fit is closer to the true density. ADR and AD2R (Luceño, 2005, Table 2 and B.1) are sensitive
to the lack of fit in the right tail of the distribution. AD2R puts more weight on the right tail
than ADR. Smaller values of ADR and AD2R are indicative of a better fit.

Table 3: Distributions, .999 quantile, and kurtosis

Distribution   Parameters                                         $x_{.999}$   Kurtosis
EV             -                                                  6.91         5.40
GEV            $\alpha = 0.1$                                     9.95         10.98
EGu            $\alpha = 0.7$                                     9.87         6.28
TEV            $\alpha = -0.99$                                   7.60         5.39
EGa            $\alpha = 0.7$                                     9.99         6.22
GGu            $\alpha = 0.7$                                     9.87         5.72
GLIV           $\alpha = 0.65$, $\beta = 15$                      10.38        6.26
TCEV           $\alpha = 0.0016$, $\mu_1 = 10$, $\sigma_1 = 5$    10.28        5.38

A characteristic that is often of interest in extreme data modeling is the return level. The
return level with return period $1/p$ is the quantile $x_{1-p}$, and it is interpreted as the value that
we expect to be exceeded once every $1/p$ periods on average. To evaluate the quantile goodness
of fit, we compute the .999 quantile discrepancy, which is defined as the difference between the
.999 quantile of the fitted model and the .999 quantile of the distribution from which the samples
are generated, divided by the latter.
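The discrepancy measure is a relative error and can be written in a few lines. The sketch below is illustrative: the function name is hypothetical, and the example assumes an idealized scenario (not a result from the study) in which a Gumbel fit to GEV(0,1,0.1) data returns exactly EV(0,1), so that both quantiles can be read off Table 3.

```python
def quantile_discrepancy(fitted_quantile, true_quantile):
    """Relative difference between fitted and true quantiles.

    Negative values mean that the fitted model underestimates the quantile.
    """
    return (fitted_quantile - true_quantile) / true_quantile


# Values taken from Table 3: .999 quantiles of EV(0,1) and GEV(0,1,0.1).
fitted_ev = 6.91   # assumption: the fitted Gumbel model equals EV(0,1)
true_gev = 9.95
d = quantile_discrepancy(fitted_ev, true_gev)
print(d)  # roughly -0.31, i.e. an underestimation of about 31%
```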

The estimates of the parameters were obtained by numerically maximizing the log-likelihood
function. For the maximization procedure, we used a method that implements a sequential
quadratic programming technique to maximize a nonlinear function subject to nonlinear
constraints, similar to Algorithm 18.7 in Wright & Nocedal (1999). This method is implemented
in the function MaxSQP in the matrix programming language Ox (Doornik, 2013) and allows
bounds to be set for the individual parameters. For the TEV distribution, we used the profile
log-likelihood function for the parameter $\alpha$. We used the bounds $-0.6 < \alpha < 0.6$ for the
GEV distribution, and $0 < \alpha < 1$ and $0 < \beta < 20$ for the GLIV distribution. The GTIEV3
distribution is not included in the simulation study because it behaves like the EV distribution
with respect to the right tail.
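For readers without access to Ox, the idea of numerically maximizing a log-likelihood can be sketched for the two-parameter Gumbel case only. This is not the MaxSQP routine described above: it is a simplified stand-in that solves the Gumbel score equation for $\sigma$ by bisection and then obtains $\mu$ in closed form. The function names, seed and sample are assumptions made for illustration.

```python
import math
import random


def sample_gumbel(n, mu, sigma, rng):
    """Inverse-cdf sampling: x = mu - sigma * ln(-ln(u)), u ~ Uniform(0, 1)."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # guard against log(0)
            out.append(mu - sigma * math.log(-math.log(u)))
    return out


def fit_gumbel_mle(x, tol=1e-10):
    """Maximum likelihood estimates (mu, sigma) for the Gumbel distribution."""
    n = len(x)
    xbar = sum(x) / n
    xmin = min(x)

    def score(sigma):
        # Weights exp(-(x - xmin) / sigma) are at most 1, so they cannot overflow.
        w = [math.exp(-(xi - xmin) / sigma) for xi in x]
        weighted_mean = sum(xi * wi for xi, wi in zip(x, w)) / sum(w)
        return sigma - xbar + weighted_mean

    lo, hi = 1e-8, max(x) - xmin  # score(lo) < 0 <= score(hi) for non-constant data
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    sigma = 0.5 * (lo + hi)
    mean_w = sum(math.exp(-(xi - xmin) / sigma) for xi in x) / n
    mu = xmin - sigma * math.log(mean_w)
    return mu, sigma


rng = random.Random(12345)
data = sample_gumbel(500, 0.0, 1.0, rng)  # one sample of size 500, as in the study
mu_hat, sigma_hat = fit_gumbel_mle(data)
print(mu_hat, sigma_hat)
```

For models with additional shape parameters, such as the GEV or GLIV distributions, a general bounded optimizer is needed instead of this one-dimensional root search.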

Figures 9 and 10 present the boxplots of AIC, ADR, AD2R, and the .999 quantile
discrepancies of the fitted models. For these figures, the samples were generated from the EV(0,1)
and the GEV(0,1,0.1) distributions, respectively. The figures corresponding to the cases
where the samples were generated from the EGu(0,1,0.7), TEV(0,1,-0.99), EGa(0,1,0.7),
GGu(0,1,0.7), GLIV(0,1,0.65,15) and TCEV(0,1,10,5,0.0016) distributions are presented
in the Supplement. For the GEV distribution, we show results for two estimation methods:
maximum likelihood estimation (GEV-MLE) and probability-weighted moments (GEV-PWM)
(Castillo et al., 2005, Section 5.3).

For the Gumbel distributed samples (Figure 9), the boxplots of AIC are quite similar and
suggest that all generalizations of the Gumbel distribution can suitably fit Gumbel distributed
data. The goodness of fit at the right tail, illustrated by the boxplots of ADR, is also quite
similar except for the Gumbel and the GLIV distributions. For the Gumbel distribution, the
median and the dispersion are bigger than for the other distributions. The GLIV fit is poor in
the right tail, which is consistent with the quantile plot in Figure 7, which suggests that a small
difference in the estimated parameter $\alpha$ produces a significant difference in the upper quantiles.
This characteristic makes the right tail goodness of fit dependent on the precision adopted for
the parameter estimation. Boxplots of AD2R emphasize the right-tail lack of fit. The boxplots
of AD2R in Figure 9 are similar except for the GLIV fit, whose interquartile range (IQR) is the
largest. The GEV-MLE right tail fit seems to be slightly better than the others. The boxplots
of the .999 quantile discrepancy show that the median is close to zero for all the distributions.
The EV and GGu fits exhibit the smallest amplitudes. The TEV fit presents some cases of
marked underestimation due to numerical problems.

Figure 9: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and .999 quantile
discrepancies (fourth row) - random samples generated from EV.

For the GEV distributed samples (Figure 10), the boxplots of AIC of the EV and TEV
fits present the biggest medians, and the GEV-MLE, GEV-PWM, and TCEV fits exhibit the
smallest amplitudes. We recall that, theoretically, the AIC of the GEV-MLE fit cannot be
bigger than that of the GEV-PWM fit. The boxplots of ADR and AD2R highlight the GEV-PWM
fit as the best right-tail fit. The boxplots of quantile discrepancy show that the GEV-PWM
and GEV-MLE fits have the medians closest to zero, and all the others underestimate
the .999 quantile, markedly so for the EV and TEV fits.

Figure 10: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and .999 quantile
discrepancies (fourth row) - random samples generated from GEV.

Hereafter we analyze the fits when the data were generated from the other distributions
(see boxplots in the Supplement). The boxplots of AIC are similar to those in Figure 9. The
boxplots of ADR for the TCEV fit exhibit the smallest medians and IQR, followed by the
GEV-PWM fit, except for samples generated from the TCEV distribution, in which case it is
followed by the EGu, EGa, and GEV-MLE fits. The boxplots of AD2R reveal that the GEV-MLE
fit is among the best fits regardless of the distribution from which the data were generated.
When the data are generated from the GLIV distribution, all the distributions underestimate the
.999 quantile, and this is the only case where the GEV-MLE and GEV-PWM fits underestimate
the .999 quantile.

Summing up, the GEV-PWM, TCEV, and GEV-MLE fits are among the best fits in all of
the simulated settings.

5 Application to a wind speed data set 

We analyze data on the maximum monthly peak gust wind speed (mph) in West Palm Beach,
Florida (USA) for the months January 1984 to November 2014, with n = 371 observations.
The data are available online for download from the National Climatic Data Center (NCDC)
- National Oceanic and Atmospheric Administration (NOAA) at http://www.ncdc.noaa.gov/ †
and are given in the Supplement. We fit the different models described in Section 2 to the
seasonally adjusted wind speed data. The seasonally adjusted wind speed was calculated by
removing the seasonal component using a robust seasonal trend decomposition implemented in
the functions stl of the R package stats and seasadj of the package forecast (Cleveland et al.,
1990). The maximum likelihood estimates of the parameters were obtained as in the Monte
Carlo simulation (Section 4).

†More specifically, http://www.ncdc.noaa.gov/ → I want to search for data at a particular location
→ Additional Data Access: Publications → Local Climatological Data. (Last accessed on January 23, 2014)

Figure 11 shows the scatterplot, the adjusted boxplot for asymmetric distributions
(Hubert & Vandervieren, 2008) and the histogram of the data together with the fitted
densities. The outliers at the right tail of the adjusted boxplot are much more spread out than
those at the left one. The empirical skewness and kurtosis coefficients are 2.26 and 13.42,
respectively. Both are much higher than those expected from a Gumbel distribution (1.14 and
5.4, respectively), which suggests fitting the generalized distributions. Recall that the only
distribution for which the skewness can be higher than 2 and the kurtosis can be higher than
9 is the GEV distribution.



Figure 11: Scatterplot, boxplot, adjusted boxplot, and histogram; wind speed data (axes: month;
maximum wind speed (m/h)).

The maximum likelihood estimates and the probability weighted moments estimates for the
GEV distribution (standard errors in parentheses) and the estimated return levels of the
seasonally adjusted series for a return period of one thousand months from the selected
distributions are summarized in Table 4. The estimate of the additional parameter $\alpha$ of the
GTIEV3 distribution is notably high, i.e., the fitted GTIEV3 distribution nearly coincides with
the fitted Gumbel distribution. Recall that the Gumbel distribution is a limiting case of the
GTIEV3 distribution when $\alpha \to \infty$. Indeed, the estimates of $\mu$ and $\sigma$ for these distributions
are the same up to the third decimal place. The estimated return levels for a return period of one
thousand months (for the seasonally adjusted data) from the selected distributions differ by up
to roughly 23 mph. The Gumbel distribution produces the smallest estimated return level
(77.23 mph), and the largest estimates are obtained from the GEV-PWM (89.52 mph) and
TCEV (100.21 mph) fits.
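As a rough plausibility check of these return levels, the rounded estimates reported in Table 4 can be plugged into the quantile functions listed in Table 1 of the Supplement with $p = .999$. Because the inputs are rounded to two decimals, the results below only approximately reproduce the tabulated values (77.23 and 85.74):

```latex
\begin{align*}
\hat{x}_{.999}^{\mathrm{EV}}
  &= \hat{\mu} - \hat{\sigma}\,\ln(-\ln 0.999)
   \approx 36.94 + 5.83 \times 6.9073 \approx 77.21, \\
\hat{x}_{.999}^{\mathrm{GEV\text{-}MLE}}
  &= \hat{\mu} + \frac{\hat{\sigma}}{\hat{\alpha}}
     \left[ (-\ln 0.999)^{-\hat{\alpha}} - 1 \right]
   \approx 36.75 + \frac{5.72}{0.06}\,(1.5135 - 1) \approx 85.70 .
\end{align*}
```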

Table 4: Parameter estimates, standard errors (in parentheses) and return level for a thousand
months - wind speed data.

Model      $\hat{\mu}$    $\hat{\sigma}$   $\hat{\alpha}$   $\hat{\beta}$ or $\hat{\mu}_1$   $\hat{\sigma}_1$   $\hat{x}_{.999}$
EV         36.94(0.32)    5.83(0.24)       -                -                                -                  77.23
GEV-MLE    36.75(0.34)    5.72(0.25)       0.06(0.03)       -                                -                  85.74
GEV-PWM    36.67(0.34)    5.53(0.25)       0.09(0.03)       -                                -                  89.52
EGu        35.78(0.95)    5.07(0.63)       0.79(0.16)       -                                -                  80.34
TEV        39.19(0.35)    6.95(0.29)       0.61(0.16)       -                                -                  80.57
GTIEV3     36.94(0.29)    5.83(0.22)       16698.18(-)      -                                -                  77.23
EGa        35.10(1.24)    4.89(0.68)       0.75(0.16)       -                                -                  80.83
GGu        36.89(0.38)    6.07(0.43)       1.03(0.11)       -                                -                  77.69
GLIV       36.67(0.48)    2.94(1.00)       0.43(0.17)       2.16(1.95)                       -                  81.98
TCEV       36.70(0.37)    5.51(0.30)       0.02(0.02)       61.56(23.41)                     13.38(8.22)        100.21

Figure 12 displays the profile log-likelihood function for the additional parameter(s) of
the fitted models. The profile log-likelihood function is well behaved if there is no inflection
point, multimodality or lack of concavity. Note the slight concavity in the profile log-likelihood
function for the EGu, EGa, and GGu models. The profile log-likelihood function for the
TEV model exhibits an inflection point, a local minimum, and two local maxima, and hence
maximization can converge to a local maximum depending on the initial value. The profile
log-likelihood function for the GTIEV3 model is increasing and flat for large values of $\alpha$. Hence,
the estimate of $\alpha$ depends on the numerical precision specified for the maximization algorithm,
and the standard error estimate is very large. The profile log-likelihood function for the GLIV
model also varies slowly in the $\beta$ parameter direction. The profile log-likelihood function for the
TCEV model presents inflection points. This appears not to disturb the parameter estimation
because of the highly concave profile log-likelihood function near its maximum. The profile
log-likelihood function for the GEV model is well behaved.



Figure 12: Profile log-likelihood function - wind speed data. 


A summary of the goodness of fit measures is given in Table 5. The GEV-MLE, GEV-PWM,
TEV, GLIV, and TCEV fits produce smaller AIC and ADR measures than the EV fit.
The AD2R measure highlights the EV and the GTIEV3 fits (235.75 and 235.82, respectively)
as the worst fits. It points to the TCEV (2.33) and the GEV-PWM (7.13) as the best fits. All
the goodness of fit measures reveal that the Gumbel distribution is not the best choice for this
data set. Taking all of the goodness of fit criteria into account, we conclude that the GEV and
TCEV distributions produce the best fits.


Table 5: Goodness of fit measures - wind speed data.

Model      $-\ell$    AIC       ADR    AD2R
EV         1245.08    2494.15   0.58   235.75
GEV-MLE    1242.98    2491.96   0.46   15.84
GEV-PWM    1243.69    2493.39   0.39   7.13
EGu        1244.34    2494.67   0.55   91.88
TEV        1242.59    2491.18   0.37   70.75
GTIEV3     1245.08    2496.15   0.58   235.82
EGa        1244.23    2494.47   0.54   84.84
GGu        1245.20    2496.40   0.65   205.59
GLIV       1243.04    2494.08   0.35   56.41
TCEV       1241.24    2492.47   0.31   2.33
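The AIC column can be cross-checked against the negative log-likelihoods in the same table. The sketch below assumes the usual definition AIC $= 2k - 2\ell$, with $k$ the number of free parameters implied by each model's notation (two for EV, three for GEV and TEV, four for GLIV, five for TCEV); the dictionary layout and names are illustrative only.

```python
def aic(neg_loglik, n_params):
    """Akaike information criterion from the negative maximized log-likelihood."""
    return 2.0 * n_params + 2.0 * neg_loglik


# (negative log-likelihood, number of parameters, AIC reported in Table 5)
table5 = {
    "EV": (1245.08, 2, 2494.15),
    "GEV-MLE": (1242.98, 3, 2491.96),
    "TEV": (1242.59, 3, 2491.18),
    "GLIV": (1243.04, 4, 2494.08),
    "TCEV": (1241.24, 5, 2492.47),
}

for model, (nll, k, reported) in table5.items():
    # differences of about 0.01 are expected because the table is rounded
    print(model, round(aic(nll, k), 2), reported)
```

The recomputed values agree with the reported ones up to rounding, which also shows how the extra parameters of the GLIV and TCEV models are penalized.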


Figure 13 shows the qqplots of the fitted models. As an aid to interpretation, envelopes
were generated by simulation. The envelopes correspond to pointwise two-sided 90% confidence
intervals with the bootstrap replicates of each curve generated from the fitted model. The
qqplots for the Gumbel, EGu, TEV, GTIEV3, EGa, GGu, and GLIV distributions clearly
suggest a lack of fit at the extreme of the right tail. However, the qqplots for the GEV
(both estimation methods, MLE and PWM) and TCEV distributions accommodate all of the
observations of the right tail inside the envelope. Therefore, the qqplots corroborate the previous
conclusions that the GEV and TCEV models provide the best fits for this data set.

Figure 13: QQplots - wind speed data.


6 Conclusion

Motivated by real problems with a probability of extreme events that is larger than usual, we
investigated distributions that generalize the Gumbel extreme value distribution, frequently used
to model extreme value phenomena. We showed that some generalized Gumbel distributions
proposed in the literature are nonidentifiable, which limits their usefulness in applications. We
gathered the moments, quantiles, data generation methods, skewness and kurtosis coefficients,
and classified their right-tail heaviness according to two criteria. We provided a simulation
study to evaluate the capacity of the selected distributions to fit data with kurtosis larger
than that of the Gumbel distribution. Our simulation results revealed that the generalized
extreme value (GEV) distribution is more flexible in fitting this type of data and that the
two-component extreme value (TCEV) distribution can also be a good choice. An application
to an extreme wind speed data set in Florida confirmed the simulation study, with the GEV
and TCEV models providing better fits than the other distributions.

As indicated by our simulations, practitioners should consider the GEV and TCEV
distributions to model extreme value data with a heavy right tail.

Supplement

A comparative review of generalizations of 
the Gumbel extreme value distribution 
with an application to wind speed data 


1 Moments and quantiles 


The skewness ($\gamma_1$) and kurtosis ($\gamma_2$) coefficients of the distributions are obtained from the central
moments ($E((X - E(X))^n)$) or the moments ($E(X^n)$), and the equations

$$
\gamma_1 = \frac{E((X - E(X))^3)}{[E((X - E(X))^2)]^{3/2}}
         = \frac{E(X^3) - 3E(X)E(X^2) + 2E(X)^3}{(E(X^2) - E(X)^2)^{3/2}},
$$

$$
\gamma_2 = \frac{E((X - E(X))^4)}{[E((X - E(X))^2)]^{2}}
         = \frac{E(X^4) - 4E(X)E(X^3) + 6E(X)^2E(X^2) - 3E(X)^4}{(E(X^2) - E(X)^2)^{2}}.
$$
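These two identities are easy to check numerically. The sketch below is an illustration rather than part of the original computations: it builds the raw moments of the standard Gumbel distribution from its well-known central moments ($\pi^2/6$, $2\zeta(3)$ and $3\pi^4/20$, taken here as assumptions) and then recovers the skewness and kurtosis from the raw moments. The function name is hypothetical.

```python
import math


def skewness_kurtosis(m1, m2, m3, m4):
    """Skewness and kurtosis from the raw moments E(X), E(X^2), E(X^3), E(X^4)."""
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / var ** 2
    return skew, kurt


EULER = 0.5772156649015329   # Euler's constant, mean of the standard Gumbel
ZETA3 = 1.2020569031595942   # Riemann zeta function at 3

# Central moments of the standard Gumbel distribution (assumed known values).
c2 = math.pi ** 2 / 6.0
c3 = 2.0 * ZETA3
c4 = 3.0 * math.pi ** 4 / 20.0

# Raw moments obtained from the central moments by binomial expansion.
m1 = EULER
m2 = c2 + m1 ** 2
m3 = c3 + 3.0 * m1 * c2 + m1 ** 3
m4 = c4 + 4.0 * m1 * c3 + 6.0 * m1 ** 2 * c2 + m1 ** 4

skew, kurt = skewness_kurtosis(m1, m2, m3, m4)
print(skew, kurt)
```

The output matches the Gumbel values quoted in the paper (skewness about 1.14 and kurtosis 5.4), which confirms in particular the minus sign in front of the $3E(X)^4$ term.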

For the GEV distribution, we used that

$$
\int_0^\infty x^{\nu - 1} \exp(-\mu x)\,dx = \frac{\Gamma(\nu)}{\mu^{\nu}},
$$

if Re $\mu > 0$ and Re $\nu > 0$ (Gradshteyn & Ryzhik, 2000, equation 3.381.4).
For the standard EGu distribution, the moment of order $n$ is

$$
E(X^n) = \int_0^1 \alpha\,(-\ln(-\ln y))^n\; {}_2F_1(1 - \alpha, 1; 1; y)\,dy.
$$

For each value of $\alpha$, the $n$-th moment can be obtained by numerical integration with the computer
algebra software Mathematica (Wolfram Research, 2012) as follows:


Clear[a]; Clear[EX1]; Clear[EX2]; Clear[EX3]; Clear[EX4];

Clear[skewness]; Clear[kurtosis];

EX1 := -a*NIntegrate[(Log[-Log[y]])^1*Hypergeometric2F1[(1 - a), 1, 1, y], {y, 0, 1}]
EX2 := a*NIntegrate[(Log[-Log[y]])^2*Hypergeometric2F1[(1 - a), 1, 1, y], {y, 0, 1}]
EX3 := -a*NIntegrate[(Log[-Log[y]])^3*Hypergeometric2F1[(1 - a), 1, 1, y], {y, 0, 1}]
EX4 := a*NIntegrate[(Log[-Log[y]])^4*Hypergeometric2F1[(1 - a), 1, 1, y], {y, 0, 1}]
a := Table[i/100, {i, 200}]

skewness = (EX3 - 3*EX2*EX1 + 2*EX1^3)/(EX2 - EX1^2)^(3/2) // N
Clear[a]; a := Table[i/100, {i, 300}]

kurtosis = (EX4 - 4*EX3*EX1 + 6*EX2*EX1^2 - 3*EX1^4)/(EX2 - EX1^2)^2 // N









For the TEV distribution, we used the moments in Aryal & Tsokos (2009, p. 1404). For the
EGa distribution, we used that

$$
\int_0^\infty x^{\nu - 1} \exp(-\mu x) (\ln x)^n\,dx
  = \frac{\partial^n}{\partial \nu^n} \left\{ \mu^{-\nu}\,\Gamma(\nu) \right\},
$$

for $n = 0, 1, 2, 3, \ldots$ (Gradshteyn & Ryzhik, 2000, equation 4.358.5), where
$\Gamma(x) = \int_0^\infty t^{x-1} \exp(-t)\,dt$, Re $x > 0$, is the Gamma function. For the GLIV distribution, we
used its moment generating function. For the GGu distribution, the central moment of order
$n$ is

$$
E((X - E(X))^n) = \int_0^\infty \left( -\sigma \left( \ln\ln(1 + y^{-1/\alpha}) + \mathcal{E}(1, \alpha) \right) \right)^n (1 + y)^{-2}\,dy,
$$

where $\mathcal{E}(n, \alpha) = -\int_0^\infty (\ln\ln(1 + y^{-1/\alpha}))^n (1 + y)^{-2}\,dy$ and $\mathcal{E}(1, 1) = \mathcal{E}$ is Euler's constant.
For each value of $\alpha$, the skewness and kurtosis can be obtained by numerical integration with
the computer algebra software Mathematica (Wolfram Research, 2012) as follows:

Clear[alpha]; Clear[E1]; Clear[E2]; Clear[E3]; Clear[E4]
alpha = Table[i/10, {i, 7, 100}]

E1 = -NIntegrate[Log[Log[1 + x^(-1/alpha)]]/(1 + x)^2, {x, 0, Infinity}]

E2 = -NIntegrate[Log[Log[1 + x^(-1/alpha)]]^2/(1 + x)^2, {x, 0, Infinity}]

E3 = -NIntegrate[Log[Log[1 + x^(-1/alpha)]]^3/(1 + x)^2, {x, 0, Infinity}]

E4 = -NIntegrate[Log[Log[1 + x^(-1/alpha)]]^4/(1 + x)^2, {x, 0, Infinity}]

skewness = -(-E3 - 3 E2*E1 - 2 E1^3)/(-E2 - E1^2)^(3/2)
kurtosis = (-E4 - 4 E3*E1 - 6 E2*E1^2 - 3 E1^4)/(-E2 - E1^2)^2

Table 1 presents moments, skewness and kurtosis coefficients and quantile functions for the 
Gumbel distribution (EV) and its generalizations. 

2 Right tail heaviness 

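Regular variation theory criterion. To obtain the index of regular variation, we use that

$$
\lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)} = \lim_{t\to\infty} \frac{x f(tx)}{f(t)}, \tag{3}
$$

$$
\lim_{t\to\infty} \frac{\exp(-\exp(-(tx - \mu)/\sigma))}{\exp(-\exp(-(t - \mu)/\sigma))} = 1, \tag{4}
$$

and

$$
\lim_{t\to\infty} \frac{\exp(-(tx - \mu)/\sigma)}{\exp(-(t - \mu)/\sigma)} =
\begin{cases}
\infty, & \text{if } 0 < x < 1 \\
1, & \text{if } x = 1 \\
0, & \text{if } x > 1.
\end{cases} \tag{5}
$$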






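Table 1: Moments, skewness and kurtosis coefficients and quantile functions

$E(X)$
EV($\mu, \sigma$)                              $\mu + \sigma\mathcal{E}$
GEV($\mu, \sigma, \alpha$), $\alpha \neq 0$    $\mu + (\sigma/\alpha)(\Gamma(1 - \alpha) - 1)$, $\alpha < 1$
EGu($\mu, \sigma, \alpha$)                     -
TEV($\mu, \sigma, \alpha$)                     $(\mu + \mathcal{E}\sigma) - \alpha\sigma \ln 2$
GTIEV3($\mu, \sigma, \alpha$)                  $\mu - \sigma(-\mathcal{E} + \ln(\alpha) - \psi(\alpha))$
EGa($\mu, \sigma, \alpha$)                     $\mu - \sigma\psi(\alpha)$
GGu($\mu, \sigma, \alpha$)                     $\mu + \sigma\mathcal{E}(1, \alpha)$
GLIV($\mu, \sigma, \alpha, \beta$)             $\mu + \sigma(\psi(\beta) - \psi(\alpha) - \ln(\beta/\alpha))$
TCEV($\mu, \sigma, \mu_1, \sigma_1, \alpha$)   $(1 - \alpha)(\mu + \sigma\mathcal{E}) + \alpha(\mu_1 + \sigma_1\mathcal{E})$

var($X$)
EV            $\sigma^2\pi^2/6$
GEV           $(\sigma/\alpha)^2(\Gamma(1 - 2\alpha) - \Gamma^2(1 - \alpha))$, $2\alpha < 1$
EGu           -
TEV           $\sigma^2(\pi^2/6 - \alpha(1 + \alpha)(\ln 2)^2)$
GTIEV3        $\sigma^2(\pi^2/6 + \psi'(\alpha))$
EGa           $\sigma^2\psi'(\alpha)$
GGu           $\sigma^2(-\mathcal{E}(2, \alpha) - \mathcal{E}^2(1, \alpha))$
GLIV          $\sigma^2(\psi'(\beta) + \psi'(\alpha))$
TCEV          $(1 - \alpha)\sigma^2\pi^2/6 + \alpha\sigma_1^2\pi^2/6 + \alpha(1 - \alpha)(\mu + \sigma\mathcal{E} - \mu_1 - \sigma_1\mathcal{E})^2$

$E(X^n)$
EV            $\sum_{i=0}^{n} \binom{n}{i} \mu^{n-i}\sigma^{i}(-1)^{i}\,\Gamma^{(i)}(1)$
GEV           $\sum_{i=0}^{n} \binom{n}{i} (\mu - \sigma/\alpha)^{n-i}(\sigma/\alpha)^{i}\,\Gamma(1 - i\alpha)$, $n\alpha < 1$
EGu           $\alpha \sum_{i=0}^{n} \binom{n}{i} \mu^{n-i}\sigma^{i}(-1)^{i} \int_0^1 (\ln(-\ln y))^{i}\,{}_2F_1(1 - \alpha, 1; 1; y)\,dy$
TEV           $\sum_{i=0}^{n} \binom{n}{i} \mu^{n-i}\sigma^{i}(-1)^{i} \left( (1 + \alpha)\Gamma^{(i)}(1) - 2\alpha \left[ \frac{\partial^{i}}{\partial k^{i}}\, 2^{-k}\Gamma(k) \right]_{k=1} \right)$
GTIEV3        $\sum_{i=0}^{n} \binom{n}{i} \mu^{n-i}\sigma^{i}(-1)^{i} \int_0^\infty (\ln y)^{i} (1 + (1/\alpha)y)^{-\alpha-1}\,dy$
EGa           $\frac{1}{\Gamma(\alpha)} \sum_{i=0}^{n} \binom{n}{i} \mu^{n-i}\sigma^{i}(-1)^{i}\,\Gamma^{(i)}(\alpha)$
GGu           $-\sum_{i=0}^{n} \binom{n}{i} \mu^{n-i}\sigma^{i}(-1)^{i}\,\mathcal{E}(i, \alpha)$
GLIV          $n$-th derivative at $t = 0$ of the moment generating function, which involves $\Gamma(\alpha - t)\Gamma(\beta + t)/(\Gamma(\alpha)\Gamma(\beta))$
TCEV          $\sum_{i=0}^{n} \binom{n}{i} (-1)^{i}\,\Gamma^{(i)}(1) \left( (1 - \alpha)\mu^{n-i}\sigma^{i} + \alpha\mu_1^{n-i}\sigma_1^{i} \right)$

Quantile $x_p$
EV            $\mu - \sigma\ln(-\ln(p))$
GEV           $\mu + (\sigma/\alpha)((-\ln(p))^{-\alpha} - 1)$
EGu           $\mu - \sigma\ln(-\ln(1 - (1 - p)^{1/\alpha}))$
TEV           $\mu - \sigma\ln\ln\left[ \left( (1 + \alpha - \sqrt{(1 + \alpha)^2 - 4\alpha p})/(2\alpha) \right)^{-1} \right]$
GTIEV3        $\mu - \sigma\ln(\alpha(p^{-1/\alpha} - 1))$
EGa           -
GGu           $\mu - \sigma\ln\ln(1 + (p/(1 - p))^{-1/\alpha})$
GLIV          -
TCEV          -

Skewness $\gamma_1$
EV            $12\sqrt{6}\,\zeta(3)/\pi^3 = 1.139547$
GEV           $\mathrm{sgn}(\alpha)\,\dfrac{\Gamma(1 - 3\alpha) - 3\Gamma(1 - 2\alpha)\Gamma(1 - \alpha) + 2\Gamma^3(1 - \alpha)}{(\Gamma(1 - 2\alpha) - \Gamma^2(1 - \alpha))^{3/2}}$
EGu           -
TEV           $\gamma_{1,EV}\,\dfrac{1 - ((\ln 2)^3/(2\zeta(3)))\,\alpha(1 + \alpha)(1 + 2\alpha)}{(1 - 6(\ln 2/\pi)^2\alpha(1 + \alpha))^{3/2}}$
GTIEV3        $(\psi''(\alpha) - \psi''(1))/(\psi'(\alpha) + \psi'(1))^{3/2}$
EGa           $-\psi''(\alpha)/\psi'(\alpha)^{3/2}$
GGu           $-\dfrac{-\mathcal{E}(3, \alpha) - 3\mathcal{E}(2, \alpha)\mathcal{E}(1, \alpha) - 2\mathcal{E}^3(1, \alpha)}{(-\mathcal{E}(2, \alpha) - \mathcal{E}^2(1, \alpha))^{3/2}}$
GLIV          $(\psi''(\beta) - \psi''(\alpha))/(\psi'(\beta) + \psi'(\alpha))^{3/2}$
TCEV          -

Kurtosis $\gamma_2$
EV            5.4
GEV           $\dfrac{\Gamma(1 - 4\alpha) - 4\Gamma(1 - 3\alpha)\Gamma(1 - \alpha) + 6\Gamma(1 - 2\alpha)\Gamma^2(1 - \alpha) - 3\Gamma^4(1 - \alpha)}{(\Gamma(1 - 2\alpha) - \Gamma^2(1 - \alpha))^{2}}$
EGu           -
TEV           $(\gamma_{2,EV} - 3)\,\dfrac{1 - 15(\ln 2/\pi)^4\alpha(1 + \alpha)(1 + 6\alpha(1 + \alpha))}{(1 - 6(\ln 2/\pi)^2\alpha(1 + \alpha))^{2}} + 3$
GTIEV3        $(\psi'''(\alpha) + \psi'''(1))/(\psi'(\alpha) + \psi'(1))^{2} + 3$
EGa           $\psi'''(\alpha)/\psi'(\alpha)^{2} + 3$
GGu           $\dfrac{-\mathcal{E}(4, \alpha) - 4\mathcal{E}(3, \alpha)\mathcal{E}(1, \alpha) - 6\mathcal{E}(2, \alpha)\mathcal{E}^2(1, \alpha) - 3\mathcal{E}^4(1, \alpha)}{(-\mathcal{E}(2, \alpha) - \mathcal{E}^2(1, \alpha))^{2}}$
GLIV          $(\psi'''(\beta) + \psi'''(\alpha))/(\psi'(\beta) + \psi'(\alpha))^{2} + 3$
TCEV          -

Notes: $\psi(x) = d(\ln\Gamma(x))/dx = \Gamma^{(1)}(x)/\Gamma(x)$ is the digamma function. $\psi^{(n)}(x)$ is the $n$-th derivative
of the digamma function (the $n$-th polygamma function); $\psi^{(1)}(x) = \psi'(x)$.
$\Gamma^{(n)}(x) = \int_0^\infty \exp(-t)\,t^{x-1}(\ln t)^n\,dt$, Re $x > 0$, is the $n$-th derivative of the Gamma function.
$\zeta(s) = \sum_{k=1}^{\infty} k^{-s}$, Re $s > 1$, is the Riemann zeta function and $\zeta(3) \approx 1.2$.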


























Let $X \sim$ GEV($\mu, \sigma, \alpha$) with $\alpha > 0$. From (3), we have

$$
\lim_{t\to\infty} \frac{\bar{F}_{GEV}(tx)}{\bar{F}_{GEV}(t)}
 = \lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)}
 = \lim_{t\to\infty} \frac{x f(tx)}{f(t)}
 = \lim_{t\to\infty} x\,
   \frac{\frac{1}{\sigma} \exp\left( -\left[ 1 + \alpha\left( \frac{tx - \mu}{\sigma} \right) \right]^{-1/\alpha} \right) \left[ 1 + \alpha\left( \frac{tx - \mu}{\sigma} \right) \right]^{-1/\alpha - 1}}
        {\frac{1}{\sigma} \exp\left( -\left[ 1 + \alpha\left( \frac{t - \mu}{\sigma} \right) \right]^{-1/\alpha} \right) \left[ 1 + \alpha\left( \frac{t - \mu}{\sigma} \right) \right]^{-1/\alpha - 1}}
 = x^{-1/\alpha},
$$

i.e., the generalized extreme value distribution with $\alpha > 0$ is regularly varying at infinity with
index $-1/\alpha$ and tail index $\alpha$. If $\alpha = 0$, from (4) and (5), we have

$$
\lim_{t\to\infty} \frac{\bar{F}_{EV}(tx)}{\bar{F}_{EV}(t)}
 = \lim_{t\to\infty} x\,
   \frac{(1/\sigma) \exp(-\exp(-(tx - \mu)/\sigma)) \exp(-(tx - \mu)/\sigma)}
        {(1/\sigma) \exp(-\exp(-(t - \mu)/\sigma)) \exp(-(t - \mu)/\sigma)}
 = \begin{cases}
   \infty, & \text{if } 0 < x < 1 \\
   1, & \text{if } x = 1 \\
   0, & \text{if } x > 1,
   \end{cases}
$$

i.e., the Gumbel distribution is rapidly varying at infinity with index $-\infty$.

The index of regular variation of the other distributions can be obtained analogously. To
obtain the index of regular variation of the TCEV($\mu, \sigma, \mu_1, \sigma_1, \alpha$) distribution we used the
software Mathematica (Wolfram Research, 2012) as follows:

Clear[a]; Clear[s1]; Clear[s2]; Clear[x]

Limit[((((1 + a)/s1)*Exp[-(t*x - m1)/s1]*
Exp[-Exp[-(t*x - m1)/s1]] - (a/s2)*Exp[-(t*x - m2)/s2]*
Exp[-Exp[-(t*x - m2)/s2]])/(((1 + a)/s1)*Exp[-(t - m1)/s1]*
Exp[-Exp[-(t - m1)/s1]] - (a/s2)*Exp[-(t - m2)/s2]*
Exp[-Exp[-(t - m2)/s2]])), {t -> Infinity},
Assumptions :> {0 < a < 1, 0 < x < 1, 0 < s1 < s2, m1 < m2}]

{\[Infinity]}

Clear[a]; Clear[s1]; Clear[s2]; Clear[x]

Limit[((((1 + a)/s1)*Exp[-(t*x - m1)/s1]*
Exp[-Exp[-(t*x - m1)/s1]] - (a/s2)*Exp[-(t*x - m2)/s2]*
Exp[-Exp[-(t*x - m2)/s2]])/(((1 + a)/s1)*Exp[-(t - m1)/s1]*
Exp[-Exp[-(t - m1)/s1]] - (a/s2)*Exp[-(t - m2)/s2]*
Exp[-Exp[-(t - m2)/s2]])), {t -> Infinity},
Assumptions :> {0 < a < 1, x > 1, 0 < s1 < s2}]

{0}

Criterion of Rigby et al. (2014). Let $X \sim$ EV($\mu, \sigma$). The logarithm of the pdf, given in
the main article, is

$$
\ln(f_{EV}(x; \mu, \sigma)) = -\ln(\sigma) - \frac{x - \mu}{\sigma} - \exp\left( -\frac{x - \mu}{\sigma} \right) \sim -\frac{1}{\sigma}\,x,
$$

as $x \to \infty$. We have $k_3 = 1$ and $k_4 = 1/\sigma$.

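Let $X \sim$ GEV($\mu, \sigma, \alpha$). The logarithm of the pdf, given in the main article, is

$$
\ln(f_{GEV}(x; \mu, \sigma, \alpha)) = -\ln(\sigma)
  - \left[ 1 + \alpha\left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\alpha}
  - \left( \frac{1}{\alpha} + 1 \right) \ln\left[ 1 + \alpha\left( \frac{x - \mu}{\sigma} \right) \right]
  \sim -\left( \frac{1}{\alpha} + 1 \right) \ln x,
$$

as $x \to \infty$, if $\alpha > 0$. We have $k_1 = 1$ and $k_2 = 1 + 1/\alpha$.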

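Let $X \sim$ EGu($\mu, \sigma, \alpha$). The logarithm of the pdf, given in the main article, is

$$
\ln(f_{EGu}(x; \mu, \sigma, \alpha)) = -\ln\left( \frac{\sigma}{\alpha} \right) - \frac{x - \mu}{\sigma}
  - \exp\left( -\frac{x - \mu}{\sigma} \right)
  + (\alpha - 1) \ln\left[ 1 - \exp\left( -\exp\left( -\frac{x - \mu}{\sigma} \right) \right) \right]
  \sim -\frac{1}{\sigma}\,x - \frac{\alpha - 1}{\sigma}\,x = -\frac{\alpha}{\sigma}\,x,
$$

as $x \to \infty$. For the approximation above we used that $\exp(-(x - \mu)/\sigma) \to 0$ as $x \to \infty$ and the
Taylor series approximation $\exp(-y) \sim (1 - y)$ as $y \to 0$, and hence
$(1 - \exp(-\exp(-(x - \mu)/\sigma))) \sim \exp(-(x - \mu)/\sigma)$ as $x \to \infty$. Therefore, $k_3 = 1$ and $k_4 = \alpha/\sigma$.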

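Let $X \sim$ TEV($\mu, \sigma, \alpha$). The logarithm of the pdf, given in the main article, is

$$
\ln(f_{TEV}(x; \mu, \sigma, \alpha)) = -\ln(\sigma) - \frac{x - \mu}{\sigma}
  - \exp\left( -\frac{x - \mu}{\sigma} \right)
  + \ln\left[ 1 + \alpha - 2\alpha \exp\left( -\exp\left( -\frac{x - \mu}{\sigma} \right) \right) \right]
  \sim -\frac{1}{\sigma}\,x,
$$

as $x \to \infty$. We have $k_3 = 1$ and $k_4 = 1/\sigma$. The fourth term of $\ln(f_{TEV}(x; \mu, \sigma, \alpha))$ tends to
$\ln(1 - \alpha)$ as $x \to \infty$. Therefore, as $x \to \infty$, $\ln(f_{TEV}(x; \mu, \sigma, \alpha))$ can be bigger or smaller than
$\ln(f_{EV}(x; \mu, \sigma))$, since $\ln(1 - \alpha) > 0$ for $\alpha < 0$, and $\ln(1 - \alpha) < 0$ for $\alpha > 0$, i.e., the TEV
distribution can have a heavier or lighter right tail than the EV distribution.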

Let $X \sim$ GTIEV3($\mu, \sigma, \alpha$). The logarithm of the pdf, given in the main article, is

$$
\ln(f_{GTIEV3}(x; \mu, \sigma, \alpha)) = -\ln(\sigma) - \frac{x - \mu}{\sigma}
  - (\alpha + 1) \ln\left( 1 + \frac{1}{\alpha} \exp\left( -\frac{x - \mu}{\sigma} \right) \right)
  \sim -\ln(\sigma) - \frac{x - \mu}{\sigma} - \frac{\alpha + 1}{\alpha} \exp\left( -\frac{x - \mu}{\sigma} \right)
  \sim -\frac{1}{\sigma}\,x,
$$

as $x \to \infty$. For the first approximation we used that $\exp(-(x - \mu)/\sigma) \to 0$ as $x \to \infty$ and the
Taylor series approximation $\ln(1 + y/\alpha) \sim y/\alpha$ as $y \to 0$. We have $k_3 = 1$ and $k_4 = 1/\sigma$. For
$\alpha > 0$, $(\alpha + 1)/\alpha > 1$. Thus, as $x \to \infty$, $\ln(f_{GTIEV3}(x; \mu, \sigma, \alpha))$ is smaller than $\ln(f_{EV}(x; \mu, \sigma))$,
i.e., the GTIEV3 distribution has a lighter right tail than the EV distribution.

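Let $X \sim$ EGa($\mu, \sigma, \alpha$). The logarithm of the pdf, given in the main article, is

$$
\ln(f_{EGa}(x; \mu, \sigma, \alpha)) = -\ln(\Gamma(\alpha)\sigma) - \alpha\,\frac{x - \mu}{\sigma}
  - \exp\left( -\frac{x - \mu}{\sigma} \right) \sim -\frac{\alpha}{\sigma}\,x,
$$

as $x \to \infty$. We have $k_3 = 1$ and $k_4 = \alpha/\sigma$.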

Let $X \sim$ GGu($\mu, \sigma, \alpha$). To approximate the logarithm of the pdf, given in the main article,
as $x \to \infty$, we used that $\exp(-(x - \mu)/\sigma) \to 0$ as $x \to \infty$ and the Taylor series approximation
$\exp(y) \sim (1 + y)$ as $y \to 0$, and hence $(\exp(\exp(-(x - \mu)/\sigma)) - 1) \sim \exp(-(x - \mu)/\sigma)$ as
$x \to \infty$. Therefore, $k_3 = 1$ and $k_4 = \alpha/\sigma$.

Let $X \sim$ GLIV($\mu, \sigma, \alpha, \beta$). From the logarithm of the pdf, given in the main article, as
$x \to \infty$, we have $k_3 = 1$ and $k_4 = \alpha/\sigma$.

Let $X \sim$ TCEV($\mu, \sigma, \alpha$). The logarithm of the pdf, given in the main article, can be written
as

$\ln(f_{TCEV}(x; \mu, \sigma, \alpha)) = -\ln(\sigma_1) \ldots$

as $x \to \infty$ if $\sigma_1 < \sigma_2$. We have $k_3 = 1$ and $k_4 = 1/\sigma_2$. When $x \to \infty$, the fourth term of
$\ln(f_{TCEV}(x; \mu, \sigma, \alpha))$ tends to infinity. Thus, $\ln(f_{TCEV}(x; \mu, \sigma, \alpha))$ is bigger than $\ln(f_{EV}(x; \mu, \sigma))$,
i.e., the TCEV distribution has a heavier right tail than the EV distribution.


3 Additional Monte Carlo simulation results 


Figures 1-6 present the boxplots of AIC, ADR, AD2R, and the quantile discrepancies of the
fitted models, when the samples were generated from the exponentiated Gumbel EGu(0,1,0.6),
transmuted extreme value TEV(0,1,-0.99), three-parameter exponential-gamma EGa(0,1,0.6),
generalized Gumbel GGu(0,1,0.7), type IV generalized logistic GLIV(0,1,0.55,10) and
two-component extreme value TCEV(0,1,10,5,0.0125) distributions. Comments on these figures
are given in Section 4 of the main article.

Figure 1: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and quantile
discrepancies (fourth row) - random samples generated from EGu.

Figure 2: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and quantile
discrepancies (fourth row) - random samples generated from TEV.

Figure 3: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and quantile
discrepancies (fourth row) - random samples generated from EGa.

Figure 4: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and quantile
discrepancies (fourth row) - random samples generated from GGu.

Figure 5: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and quantile
discrepancies (fourth row) - random samples generated from GLIV.

Figure 6: Boxplots of AIC (first row), ADR (second row), AD2R (third row) and quantile
discrepancies (fourth row) - random samples generated from TCEV.


4 Application

Table 2 presents the wind speed data.

Table 2: Wind speed data.